(57) Abstract: The present invention provides a method for detecting and differentiating disease states with high sensitivity and
specificity. The method allows for a determination of whether a cell-based sample contains abnormal cells and, for certain diseases,
is capable of determining the histologic type of disease present. The method detects changes in the level and pattern of expression
of the molecular markers in the cell-based sample. Panel selection and validation procedures are also provided.



WO 2004/025251



PCT/US2003/028379 



CELL-BASED DETECTION AND
DIFFERENTIATION OF DISEASE STATES 

CROSS REFERENCE TO RELATED APPLICATIONS
[0000,1] The present application is a continuation-in-part of U.S. Application Serial No.
10/095,298, filed March 12, 2002, which claims the benefit of U.S. Provisional Application 
Serial No. 60/274,638, filed March 12, 2001, the entire contents of which are incorporated by 
reference herein. 


BACKGROUND OF THE INVENTION 

[0001] The present invention relates to early detection of a general disease state in a patient.
The present invention also relates to discrimination (differentiation) between specific disease
states in their early and later stages.

[0002] Early detection of a specific disease state can greatly improve a patient's chance for
survival by permitting early diagnosis and early treatment while the disease is still localized and
its pathologic effects limited anatomically, physiologically, and clinically. Two key evaluative
measures of any test or disease detection method are its sensitivity (Sensitivity = True
Positives/(True Positives + False Negatives)) and specificity (Specificity = True Negatives/(False
Positives + True Negatives)), which measure how well the test detects all affected individuals
without exception, and without falsely including individuals who do not have the target
disease. Historically, many diagnostic tests have been criticized due to poor
sensitivity and specificity.
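
The two definitions above can be illustrated with a minimal Python sketch. The function names and the counts used below are hypothetical and chosen only for illustration; they are not data from this application.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity = TP / (TP + FN): fraction of diseased individuals detected."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no diseased individuals in the sample")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Specificity = TN / (FP + TN): fraction of disease-free individuals cleared."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no disease-free individuals in the sample")
    return true_neg / total


# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 90, 10, 160, 40
print(f"sensitivity = {sensitivity(tp, fn):.2f}")  # 0.90
print(f"specificity = {specificity(tn, fp):.2f}")  # 0.80
```

In this hypothetical case the test misses 10 of 100 diseased individuals (false negatives) and wrongly flags 40 of 200 disease-free individuals (false positives).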

[0003] Sensitivity is a measure of a test's ability to detect correctly the target disease in an
individual being tested. A test having poor sensitivity produces a high rate of false negatives,
i.e., individuals who have the disease but are falsely identified as being free of that particular
disease. The potential danger of a false negative is that the diseased individual will remain
undiagnosed and untreated for some period of time, during which the disease may progress to a
later stage wherein treatments, if any, may be less effective. This may result in poorer patient
outcomes. An example of a test that has low sensitivity is a protein-based blood test for HIV.
This type of test exhibits poor sensitivity because it fails to detect the presence of the virus until
the disease is well established and the virus has invaded the bloodstream in substantial numbers.
In contrast, an example of a test that has high sensitivity is viral-load detection using the
polymerase chain reaction (PCR). High sensitivity is achieved because this type of test can
detect very small quantities of the virus (see Lewis, D.R. et al., "Molecular Diagnostics: The
Genomic Bridge Between Old and New Medicine: A White Paper on the Diagnostic Technology
and Services Industry," Thomas Weisel Partners, June 13, 2001).

[0004] Specificity, on the other hand, is a measure of a test's ability to identify accurately
patients who are free of the disease state. A test having poor specificity produces a high rate of
false positives, i.e., individuals who are falsely identified as having the disease. A drawback of
false positives is that they force patients to undergo unnecessary medical procedures or treatments,
with their attendant risks and emotional and financial stresses, which could have adverse effects
on the patient's health. A feature of diseases which makes it difficult to develop diagnostic tests
with high specificity is that disease mechanisms often involve a plurality of genes and proteins.
Additionally, certain proteins may be elevated for reasons unrelated to a disease state. An
example of a test that has high specificity is a gene-based test that can detect a p53 mutation. A
p53 mutation will never be detected unless there are cancer cells present (see Lewis, D.R. et al.,
"Molecular Diagnostics: The Genomic Bridge Between Old and New Medicine: A White Paper
on the Diagnostic Technology and Services Industry," Thomas Weisel Partners, June 13, 2001).
[0005] Cellular markers are naturally occurring molecular structures within cells that can be
discovered and used to characterize or differentiate cells in health and disease. Their presence
can be detected by probes, invented and developed by human beings, which bind to markers,
enabling the markers to be detected through visualization and/or quantified using imaging
systems. Four classes of cell-based marker detection technologies are cytopathology, cytometry,
cytogenetics and proteomics, which are identified and described below.
[0006] Cytopathology relies upon the visual assessment by human experts of
cytomorphological changes within stained whole-cell populations. An example is the cytological
screening and cytodiagnosis of Papanicolaou-stained (i.e., Pap smear) cervical-vaginal specimens
by cytotechnologists and cytopathologists, respectively. Unlike cytogenetics, proteomics and
cytometry, cytopathology is not a quantitative tool. While it is the state-of-the-art in clinical
diagnostic cytology, it is subjective and the diagnostic results are often not highly sensitive or
reproducible, especially at early stages of cancer (e.g., ASCUS, LSIL).
[0007] Tests that rely on morphological analyses involve observing a sample of a patient's cells
under an optical microscope to identify abnormalities in cell and nuclear shape, size, optical
texture, or staining behavior. When viewed through a microscope, normal mature epithelial cells
appear large and well differentiated, with condensed nuclei. Cells characterized by dysplasia,
however, may be in a variety of stages of differentiation, with some cells being very immature.
Finally, cells characterized by invasive carcinoma often appear undifferentiated, with very little
cytoplasm and relatively large nuclei.

[0008] A drawback to diagnostic tests that rely on morphological analyses is that cell
morphology is a lagging indicator. Since form follows function, often the disease state has
already progressed to a critical or advanced stage by the time the disease becomes evident by
morphological analysis. The initial stages of a disease involve chemical changes at a molecular
level. Changes that are detectable by viewing cell features under a microscope are typically not
apparent until later stages of the disease. Therefore, tests that measure chemical changes on a
molecular level, referred to as "molecular diagnostic" tests, are more likely to provide early
detection than tests that rely on morphological analyses alone.
[0009] Cytometry is based upon the flow-microfluorometric instrumental analysis of
fluorescently stained cells moving in single file in solution (flow cytometry) or the
computer-aided microscope instrumental analysis of stained cells deposited onto glass microscope slides
(image cytometry). Flow cytometry applications include leukemia and lymphoma
immunophenotyping. Image cytometry applications include DNA ploidy,
Malignancy-Associated Changes (MACs), cell-cycle kinetics and S-phase analyses. The flow and image
cytometry approaches yield quantitative data characterizing the cells in suspension or on a glass
microscope slide. Flow and image cytometry can produce good marker detection and
differentiation results depending upon the sensitivity and specificity of the cellular stains and
flow/image measurement features used.

[0010] Malignancy-Associated Changes (MACs) have been qualitatively observed and reported
since the early to mid-1900's (OC Gruner: "Study of the changes met with leukocytes in certain
cases of malignant disease" in Brit J Surg 3: 506-522, 1916) (HE Nieburgs, FG Zak, DC Allen, H
Reisman, T Clardy: "Systemic cellular changes in material from human and animal tissues" in
Transactions, 7th Ann Mtg Inter Soc Cytol Council, pp 137-144, 1959). From the mid-1900's
through 1975, MACs were documented in independent qualitative histology and cytology studies
in buccal mucosa and buccal smears (Nieburgs, Finch, Klawe), duodenum (Nieburgs), liver
(Elias, Nieburgs), megakaryocytes (Ramsdahl), cervix (Nieburgs, Howdon), skin (Kwitiken),
blood and bone marrow (Nieburgs), monocytes and leukocytes (van Haas, Matison, Clausen),
and lung and sputum (Martuzzi and Oppen Toth). Before 1975 these qualitative studies reported
MAC-based sensitivities for specific disease detection from 76% to 97% and specificities from
50% to 90%. In 1975, Oppen Toth reported a sensitivity of 76% and specificity of 81% in a
qualitative sputum analysis study.

[0011] Quantitative observations regarding MAC-based probe analysis began two to three
decades ago (H Klawe, J Rowinski: "Malignancy associated changes (MAC) in cells of buccal
smears detected by means of objective image analysis" in Acta Cytol 18: 30-33, 1974) (GL
Wied, PH Bartels, M Bibbo, JJ Sychra: "Cytomorphometric markers for uterine cancer in
intermediate cells" in Analyt Quant Cytol 2: 257-263, 1980) (G Burger, U Jutting, K
Rodenacker: "Changes in benign population in cases of cervical cancer and its precursors" in
Analyt Quant Cytol 3: 261-271, 1981). MACs were documented in independent quantitative
histology and cytology studies in buccal mucosa and smears (Klawe, Burger), cervix (Wied,
Burger, Bartels, Vooijs, Reinhardt, Rosenthal, Boon, Katzke, Haroske, Zahniser), breast (King,
Bibbo, Susnik), bladder and prostate (Sherman, Montironi), colon (Bibbo), lung and sputum
(Swank, MacAulay, Payne), and nasal mucosa (Reith) studies with MAC-based sensitivities from
70% to 89% and specificities from 52% to 100%. Marek and Nakhosteen showed (1999,
American Thoracic Society annual meeting) the results from two quantitative pulmonary
(bronchial washings) studies showing (a) sensitivity of 89% and specificity of 92%, and (b)
sensitivity of 91% and specificity of 100%.
[0012] Clearly, Malignancy-Associated Changes (MACs) are potentially useful probes that
result from the image-cytometry marker detection technology. MAC-based features from
DNA-stained nuclei can be used in conjunction with other molecular diagnostic probes to create
optimized molecular diagnostic panels for the detection and differentiation of lung cancer and
other disease states.

[0013] Cytogenetics detects specific chromosome-based intracellular changes using, for
example, in situ hybridization (ISH) technology. ISH technology can be based upon
fluorescence (FISH), multi-color fluorescence (M-FISH), or light-absorption-based
chromogenics imaging (CHRISH) technologies. The family of ISH technologies uses DNA or
RNA probes to detect the presence of the complementary DNA sequence in cloned bacterial or
cultured eukaryotic cells. FISH technology can, for example, be used for the detection of genetic
abnormalities associated with certain cancers. Examples include probes for Trisomy 8 and
HER-2/neu. Other highly sensitive as well as specific technologies such as polymerase chain reactions
(PCR) can be used to detect B-cell and T-cell gene rearrangements. Cytogenetics is a highly
specific marker detection technology since it detects the causative or "trigger" molecular event
producing a pathology condition. It may, in general, be less sensitive than the other marker
detection technologies because fewer events may be present to detect. In situ hybridization (ISH)
is a molecular diagnostic method that uses gene-based analyses to detect abnormalities on the
genetic level such as mutations, chromosome errors or genetic material inserted by a specific
pathogen. For example, in situ hybridization may involve measuring the level of a specific
mRNA by treating a sample of a patient's cells with labeled primers designed to hybridize to the
specific mRNA, washing away unbound primers and measuring the signal of the label. Due to
the uniqueness of gene sequences, a test involving the detection of gene sequences will likely
have a high specificity, yielding very few false positives. However, because the amount of
genetic material in a sample of cells may be very low, only a very weak signal may be obtained.
Therefore, in situ hybridization tests that do not employ pre-amplification techniques will likely
have a poor sensitivity, yielding many false negatives.

[0014] Proteomics depends upon cell characterization and differentiation resulting from the
over-expression, under-expression, or presence/absence of unique or specific proteins in
populations of normal or abnormal cell types. Proteomics includes not only the identification
and quantification of proteins, but also the determination of their localization, modifications,
interactions, chemical activities, and cellular/extracellular functions. Immunochemistry (IC)
(immunocytochemistry in cells and immunohistochemistry (IHC) in tissues) is the technology
used, either qualitatively or quantitatively (QIHC), to stain antigens (i.e., proteomes) using
antibodies. Immunostaining procedures use a dye as the detection indicator. Examples of IHC
applications include analyses for ER (estrogen receptor), PR (progesterone receptor), p53 tumor
suppressor genes, and EGFR prognostic markers. Proteomics is typically a more sensitive
marker detection technology than cytogenetics because there are often orders of magnitude more
protein molecules to detect using proteomics than there are cytogenetic mutations or
gene-sequence alterations to detect using cytogenetics. However, proteomics may have a poorer
specificity than the cytogenetic marker detection technology since multiple pathologies may
result in similar changes in protein over-expression or under-expression. Immunochemistry
involves histological or cytological localization of immunoreactive substances in tissue sections
or cell preparations, respectively, often utilizing labeled antibodies as probe reagents.
Immunochemistry can be used to measure the concentration of a disease marker (specific
protein) in a sample of cells by treating the cells with an agent such as a labeled antibody (probe)
that is specific for an epitope on the disease marker, then washing away unbound antibodies and
measuring the signal of the label. Immunochemistry is based on the property that cancer cells
possess different levels of certain disease markers than do healthy cells. The concentration of a
disease marker in a cancer cell is generally large enough to produce a large signal. Therefore,
tests that rely on immunochemistry will likely have a high sensitivity, yielding few false
negatives. However, because other factors in addition to the disease state may cause the
concentration of a disease marker to become raised or lowered, tests that rely on
immunochemical analysis of a specific disease marker will likely have poor specificity, yielding
a high rate of false positives.

[0015] The present invention provides for a noninvasive disease state detection and
discrimination method with both high sensitivity and high specificity. This method is useful for
patient screening. The present invention also provides a disease state detection and
discrimination method with both high sensitivity and high specificity. This method is useful for
patient diagnosis and therapeutic monitoring. The method involves contacting a cytological
sample or multiple samples suspected of containing diseased cells with a panel of probes
comprising a plurality of agents, each of which quantitatively binds to a specific disease marker,
and detecting and analyzing the pattern of binding of the probe agents. The present invention
also provides methods of constructing and validating a panel of probes for detecting a specific
disease (or group of diseases) and discriminating among its various disease states. Illustrative
panels for detecting lung cancer and discriminating among different types of lung cancer are also
provided. Illustrative panels for other cancers and non-cancer disease states are also provided.
[0016] A human disease results from the failure of the human organism's adaptive mechanisms
to neutralize external (i.e., local or global environmental) or internal insults which result in
abnormal structures or functions within the body's cells, tissues, organs or systems. Diseases can
be grouped by shared mechanisms of causation as illustrated below, in Table 1.

Table 1:

[illegible]                                  Adverse reactions to foods and plants
Cardiovascular                               Heart failure, atherosclerosis
Degenerative (neurological and muscular)     Alzheimer's and Parkinson's
Diet                                         Non-nutritional substances and excess/imbalanced nutrition
Hereditary                                   Sickle cell anemia, cystic fibrosis
Immune                                       HIV and autoimmune
Infection                                    Viral, bacterial, fungal, parasitic
Metabolic                                    Diabetes
Molecular and cell biology                   Cancer (neoplasia)
Toxic insults                                Alcohol, drugs, environmental mutagens and carcinogens
Trauma                                       Bodily injury from automobile collision


[0017] Disease states are either caused by or result in abnormal changes (i.e., pathological 
conditions) at a subcellular, cellular, tissue, organ, or human anatomic or physiological system 
level. Many disease states (e.g., lung cancer) are characterized by abnormal changes at a 
subcellular or cellular level. Specimens (e.g., cervical Pap smears, voided urine, blood, sputum,
colonic washings) can be collected from patients with suspected disease states to diagnose those 
patients for the presence and type of the disease state. Molecular pathology is the discipline that 
attempts to identify and diagnostically exploit the molecular changes associated with these cell- 
based diseases. 

[0018] Lung cancer is an illustrative example of a disease state in which screening of high-risk
populations and at-risk individuals can be performed using diagnostic tests (e.g., molecular
diagnostic panel assays) to detect the presence of the disease state. Also, for patients in which
lung cancer or other disease states have been detected by these means, related diagnostic tests can
be employed to differentiate the specific disease state from related or co-occurring disease states.
For example, in this lung cancer illustration, additional molecular diagnostic panel assays may
indicate the probabilities that the patient's disease state is consistent with one of the following
types of lung cancer: (a) squamous cell carcinoma of the lung, (b) adenocarcinoma of the lung,
(c) large cell carcinoma of the lung, (d) small cell carcinoma of the lung, or (e) mesothelioma.
Early detection and differentiation of cell-based disease states is a hypothesized means to
improve patient outcomes.

[0019] Cancer is a neoplastic disease, the natural course of which is fatal. Cancer cells, unlike
benign tumor cells, exhibit the properties of invasion and metastasis and are highly anaplastic.
Cancer includes the three broad categories of carcinoma (i.e., epithelial cell-based cancers),
sarcoma (e.g., bone-based cancers), and blood-based cancers (e.g., leukemia and lymphoma), but
in lay usage each of the three types is often referred to synonymously with carcinoma.
According to the World Health Organization (WHO), cancer affects more than 10 million people
each year and is responsible for in excess of 6.2 million deaths.

[0020] Cancer is, in reality, a heterogeneous collection of diseases that can occur in virtually
any part of the body. As a result, different treatments are not equally effective in all cancers or
even among the stages of a specific type of cancer. Advances in diagnostics (e.g.,
mammography, cervical cytology, and serum PSA testing) have, in some cases, allowed for the
detection of early-stage cancer when there are a greater number of treatment options, and
therapies tend to be more effective. In cases where a solid tumor is small and localized, surgery
alone may be sufficient to produce a cure. However, in cases where the tumor has spread,
surgery may provide, at best, only limited benefits. In such cases the addition of chemotherapy
and/or radiation therapy may be used to treat metastatic disease. While somewhat effective in
prolonging life, treatment of patients with non-blood-based metastatic disease rarely produces a
cure. Even though there may be an initial response, with time the disease progresses and the
patient ultimately dies from its effects and/or from the toxic effects of the treatments.
[0021] While not proven, it is generally accepted that early detection and treatment will reduce
the morbidity, mortality and cost of cancer. Early detection will, in many cases, permit treatment
to be initiated prior to metastasis. Furthermore, because there are a greater number of treatment
options, there is a higher probability of achieving a cure or significant improvement in long-term
survival.

[0022] Developing a test that can be used to screen an "at-risk" population has long been a goal
of health practitioners. While there have been some successes such as mammography for breast
cancer, PSA testing for prostate cancer, and the Pap smear for cervical cancer, in most cases
cancer is detected at a relatively late stage where the patient is symptomatic and the disease is
almost always fatal. For most cancers, no test or combination of tests has exhibited the necessary
sensitivity and specificity to permit cost-effective identification of patients with early stage
disease.

[0023] For a cancer screening program to be successful and gain acceptance by patients, 
physicians, and third-party payers, the test must have implied benefit (changes the outcome), be 
widely available and be able to be carried out readily within the framework of general healthcare. 
The test should be relatively noninvasive, leading to adequate compliance, have high sensitivity, 
and reasonable specificity and predictive value. In addition, the test must be available at 
relatively low cost.

[0024] For patients who are suspected of having cancer, the diagnosis must be confirmed and 
the tumor properly staged cytologically and clinically in order for physicians to undertake 
appropriate therapeutic intervention. Some tests currently being used in the diagnosis and 
staging of cancer, however, either lack sufficient sensitivity or specificity, are too invasive, or are 
too costly to justify their use as a population-based screening test. Shown below in Tables 2 and 
3, for example, are estimates of sensitivity and specificity of lung cancer diagnostics and 
estimated costs (U.S. dollars) for diagnostic tests used to detect lung cancer. 
Table 2:

Estimates of Sensitivity and Specificity of Lung Cancer Diagnostics [1]

DIAGNOSTIC TEST                 SENSITIVITY (%)    SPECIFICITY (%)
Conventional Sputum Cytology    51.0               100.0
Chest X-ray                     16-85*             90-95
White Light Bronchoscopy        48.0-80.0          91.1-96.8
LIFE Bronchoscopy               72.0               86.7
Computed Tomography             63.0-99.9          80.0-61
PET Scan                        88.0-92.5          83.0-93.0

*Dependent upon the stage of the disease at the time of diagnosis

Table 3:

Estimated Costs for Diagnostic Tests Used in Lung Cancer [1]

DIAGNOSTIC TEST         COST ($)
Sputum Cytology         90
Chest X-ray             44
Bronchoscopy            725
Computed Tomography     378
PET Scan                800-3000
Open Biopsy             12,847-14,121


[0025] The chest radiograph (X-ray) is often used to detect and localize cancer lesions due to 
its reasonable sensitivity, high specificity and low cost. However, small lesions are often 
difficult to detect and although larger tumors are relatively easy to visualize on a chest film, at 
the time of detection most have already metastasized. Thus, chest X-rays lack the necessary 
sensitivity for use as an early detection method. 

[0026] Computed tomography (CT) is useful in the confirmation and characterization of
pulmonary nodules and allows the detection of subtle abnormalities that are often missed on a
standard chest X-ray [2]. CT, and Spiral CT methods in particular, remains the test of choice for
patients who present with a prior malignant sputum cytology result or vocal cord paralysis. CT,
with its improved sensitivity over the conventional chest film, has become the primary tool for
imaging the central airway [3]. While capable of examining large areas, CT is subject to
artifacts from cardiac and respiratory motion, although improved resolution can be achieved
through the use of iodinated contrast material.

[0027] Spiral CT is a more rapid and sensitive form of CT that has the potential to detect early
cancer lesions more reliably than either conventional CT or X-ray. Spiral CT appears to have
greatly improved sensitivity in diagnosing early disease. However, the test has relatively low
specificity with a 20% false positive rate [4]. As the resolution of Spiral CT instruments improves
through engineering technology advances, the false positive rate is likely to increase. Spiral CT is also
less sensitive in detecting the central lesions that represent one-third of all lung cancers.
Furthermore, while the cost of the initial test is relatively low ($300), the cost of follow-up can
be at least an order of magnitude higher. Cytology using molecular diagnostic panel assays
offers significant promise as an adjunctive test with Spiral CT to improve the specificity of Spiral
CT testing by minimizing false positive results through the evaluation of fine needle aspirations
(FNAs) or biopsies (FNBs) from Spiral CT-suspicious pulmonary nodules.
[0028] Fluorescence bronchoscopy provides increased sensitivity over conventional white light 
bronchoscopy, significantly improving the detection of small lesions within the central airway 
[5]. However, fluorescence bronchoscopy is unable to detect peripheral lesions, it takes a long 
time for bronchoscopists to examine a patient's airways, and it is an expensive procedure. 
Additionally, the procedure is moderately invasive, creating an insurmountable barrier to its use 
as a population-based screening test. 

[0029] Positron Emission Tomography (PET) is a highly sensitive test that utilizes radioactive 
glucose to identify the presence of cancer cells within the lung [6-8]. The cost of establishing a
testing facility is high and there is the need for a cyclotron on site or nearby. Also, implementing 
centralized testing is a logistical problem. This, coupled with the high cost of the test, has limited 
the use of PET scans to staging lung cancer patients rather than for early detection of the disease. 
[0030] Although used for some time as a means of screening for lung cancer, sputum cytology 
has enjoyed only limited success due to its low sensitivity and its failure to reduce disease- 
specific mortality. In conventional sputum cytology, the pathologist uses characteristic changes 
in cellular morphology to identify malignant cells and make a diagnosis of cancer. Today only
15% of patients who are "at-risk" or who are suspected of having lung cancer undergo sputum
cytology testing, and less than 5% undergo multiple evaluations [9]. A number of factors 
including tumor size, location, degree of differentiation, cell clumping, inefficiency of clearing 
mechanisms to release cells and sputum to the external environment, and the poor stability of 
cells within the sputum contribute to the overall poor performance of the test. 
[0031] Cancer diagnostics has traditionally relied upon the detection of single molecular 
markers. Unfortunately, cancer is a disease state in which single markers have typically failed to 
detect or differentiate many forms of the disease. Thus, probes that recognize only a single 
marker have been shown to be largely ineffective. Exhaustive searches for "magic bullet" 
diagnostic tests have been underway for many decades, though no universally successful magic
bullet probes have been found to date.

[0032] A major premise of this invention is that cell-based cancer diagnostics and the
screening, diagnosis for, and therapeutic monitoring of other disease states will be significantly
improved over the state-of-the-art by using kits of multiple, simultaneously labeled probes
rather than single marker/probe analyses. This multiplexed analytical approach is particularly
well suited for cancer diagnostics since cancer is not a single disease. Furthermore, this
multi-factorial "panel" approach is consistent with the heterogeneous nature of cancer, both
cytologically and clinically.

[0033] Key to the successful implementation of a panel approach to cell-based diagnostic tests 
is the design and development of optimized panels of probes that can chemically recognize the 
pattern of markers that characterizes and distinguishes a variety of disease states. This patent 
application describes an efficient and unique methodology to design and develop such novel and 
optimized panels. 

[0034] Improved methods for specimen collection (e.g., point-of-care mixers for sputum
cytology) and preparation (e.g., new cytology preservation and transportation fluids, and
liquid-based cytology preparation instruments) are under development and becoming commercially
available. In conjunction with existing and these emerging methods, a successful implementation
of this molecular diagnostics cell-based panel assay will lead to (a) characterization of the
molecular profile of malignant tumors and other disease states, (b) improved methods for early
cancer and other disease state detection and differentiation, and (c) opportunities for improved
clinical diagnoses, prognoses, customized patient treatments, and therapeutic monitoring.

SUMMARY OF THE INVENTION 
[0035] The present invention is directed to a panel for detecting a generic disease state or
discriminating between specific disease states using cell-based diagnosis. The panel comprises a
plurality of probes each of which specifically binds to a marker associated with a generic or
specific disease state, wherein the pattern of binding of the component probes of the panel to
cells in a cytology specimen is diagnostic of the presence or specific nature of said disease state.
The present invention is also directed to a method of forming a panel for detecting a disease state
or discriminating between disease states in a patient using cell-based diagnosis. The method
involves determining the sensitivity and specificity of binding of probes each of which
specifically binds to a member of a library of markers associated with a disease state and
selecting a limited plurality of said probes whose pattern of binding is diagnostic for the presence
or specific nature of said disease state. The present invention is also directed to a method of
detecting a disease or discriminating between disease states. The method involves contacting a
cytological sample suspected of containing abnormal cells characteristic of a disease state with a
panel according to claim 1 and detecting a pattern of binding of said probes that is diagnostic for
the presence or specific nature of said disease state.

BRIEF DESCRIPTION OF THE FIGURES 
[0036] Figure 1. Molecular markers that are preferable markers to be included in a panel for 
identifying different histologic types of lung cancer. The column labeled indicates the 
percentage of tumor specimens that express a particular marker. 

[0037] Figure 2. Potential ways in which different markers may be used to discriminate 
between specific types of lung cancer. SQ indicates squamous cell carcinoma, AD indicates 
adenocarcinoma, LC indicates large cell carcinoma, SC indicates small cell carcinoma and ME 
indicates mesothelioma. The numbers appearing in each cell represent the frequency of marker 
change in one cell type versus another. To be included in the table, the ratio must be greater than 
2.0 or less than 0.5. A number larger than 100 generally indicates that the second marker is not 
expressed. In such cases the denominator was set at 0.1 for the purpose of the analysis. Finally, 
empty cells represent either no difference in expression or the absence of expression data. 
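
The inclusion rule described for Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example frequencies are hypothetical, frequencies are assumed to be expressed as percentages of specimens, and identical frequencies or missing data are assumed to give an empty cell.

```python
def frequency_ratio(freq_first, freq_second, floor=0.1):
    """Ratio of marker frequency (percent) in one cell type versus another.

    When the second cell type does not express the marker, the denominator
    is set to 0.1, as described for Figure 2.
    """
    denominator = freq_second if freq_second > 0 else floor
    return freq_first / denominator


def table_entry(freq_first, freq_second):
    """Return the ratio if it qualifies for the table, otherwise None (empty cell)."""
    if freq_first is None or freq_second is None:
        return None  # absence of expression data
    if freq_first == freq_second:
        return None  # no difference in expression (assumed to be an empty cell)
    ratio = frequency_ratio(freq_first, freq_second)
    if ratio > 2.0 or ratio < 0.5:
        return ratio
    return None


# Hypothetical frequencies (percent of specimens expressing a marker).
print(table_entry(80, 20))   # qualifies: ratio above 2.0
print(table_entry(30, 20))   # empty cell: ratio between 0.5 and 2.0
print(table_entry(60, 0))    # second type not expressed: ratio exceeds 100
```

With the 0.1 floor, any marker expressed in more than 10% of one type and absent from the other yields a ratio above 100, consistent with the note that such values generally indicate non-expression in the second type.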
[0038] Figure 3. Comparisons between H-scores for probes 7 and 15 in control tissue and in 
cancerous tissue. The X-axis shows the H-scores while the Y-axis shows the percent of cases. 
[0039] Figure 4. Correlation matrix, in which correlation measures the amount of linear 
association between a pair of variables. All markers in this matrix with a correlation number of 
50% or higher are considered correlated markers. Note that all diagonal elements of this 
correlation matrix have a value of 1.0 (i.e., True) because the diagonal elements show 
auto-correlation values (i.e., Probe N correlation to Probe N). Also, note that this matrix is diagonally 
symmetric (i.e., the correlation value of Probe N versus M is identical to the correlation value of 
Probe M versus N). 
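
As a hedged illustration of the correlation matrix described for Figure 4, the following Python sketch computes pairwise Pearson correlations between probe scores and lists the pairs at or above the 50% level. The probe names and scores are invented for the example, and the use of the Pearson coefficient is an assumption; the text says only that correlation measures linear association.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


def correlation_matrix(scores):
    """Build a nested dict: matrix[a][b] is the correlation of probe a with probe b."""
    names = list(scores)
    return {a: {b: pearson(scores[a], scores[b]) for b in names} for a in names}


def correlated_pairs(matrix, threshold=0.5):
    """List each unordered probe pair whose correlation is at or above the threshold."""
    names = list(matrix)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if matrix[a][b] >= threshold
    ]


# Hypothetical per-specimen scores for three probes.
scores = {
    "probe_a": [0, 2, 4, 6, 8],
    "probe_b": [1, 3, 5, 7, 9],
    "probe_c": [8, 1, 6, 0, 5],
}
matrix = correlation_matrix(scores)
print(correlated_pairs(matrix))
```

Because the matrix is symmetric and its diagonal is 1.0, only the pairs above the diagonal need to be inspected, which is what `correlated_pairs` does.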

[0040] Figure 5. Detection panel compositions, pair-wise discrimination panel 
compositions and joint discrimination panel compositions. Panel compositions using decision 
tree analysis, stepwise LR and stepwise LD are shown. Note that shaded boxes identify probes 
that are shown to be effective by two or more of these independent analytical methods. 
[0041] Figure 6. Detection panel compositions wherein probe 7 was not included as a 
probe. Panel compositions using decision tree analysis, stepwise LR and stepwise LD are shown. 
Note that shaded boxes identify probes that are shown to be effective by two or more of these 
independent analytical methods. 
[0042] Figure 7. Detection panel compositions using only commercially preferred probes. 
Panel compositions using decision tree analysis, stepwise LR and stepwise LD are shown. Note 
that shaded boxes identify probes that are shown to be effective by two or more of these 
independent analytical methods. 

[0043] Figures 8a-c. Summary of the preferred markers (probes) for panels for detecting and/or 
diagnosing lung, colorectal, bladder, prostate, breast and cervical cancer. 

DETAILED DESCRIPTION OF THE INVENTION 
1. Introduction 

[0044] The present invention provides a noninvasive disease state detection and discrimination 
method with high sensitivity and specificity. The method involves contacting a cytological or 
histological sample suspected of containing diseased cells with a panel comprising a 
plurality of agents, each of which quantitatively binds to a disease marker, and detecting a pattern 
of binding of the agents. This pattern includes the localization and density/concentration of 
binding of the component probes of the panel. The present invention also provides methods of 
making a panel for detecting a disease and also for discriminating between disease states, as well 
as panels for detecting lung cancer in early stages and discriminating between different types of 
lung cancer. Panel tests have been used in medicine. For example, panels are used in blood 
serum analysis. However, because a cytology analysis involves imaging and localization of 
specific markers within individual cells and tissues, prior to the present invention it was not 
apparent that the panel approach would be effective for cytology or histology samples. 
Additionally, it was not apparent which, if any, statistical analyses could be applied to design and 
develop an optimized cell-based diagnostic panel of probes. 

[0045] One of the few examples of a cytology-based screening program is the Pap Smear, 
which screens for cervical cancer. For over 50 years this method has been practiced and has 
greatly contributed to the fact that today, almost no woman who has regular Pap smears dies of 
cervical cancer. There are drawbacks, however, to the Pap smear screening program. For 
example, Pap smears are labor intensive, subject to the variability associated with human 
performance, and are not universally accessible. The present molecular diagnostic cell-based 
screening method utilizing probe panels does not suffer from these drawbacks. The method may 
be fully automated and thereby made less expensive and reproducible, increasing access to this 
type of testing. 
[0046] The present invention provides a method, having both high specificity and high 
sensitivity, for detecting a disease state and for discriminating between disease states. The 
invention is applicable to any cell-based disease state, such as cancer and infectious diseases. 
[0047] The panel is diagnostic of the presence or specific nature of the disease state. The 
present invention overcomes the limitations and drawbacks of known disease state detection 
methods by enabling quick, accurate, relatively noninvasive and easy detection and 
discrimination of diseased cells in a cytological sample while keeping costs low. 
[0048] A feature of the inventive method for making a panel of the present invention is the 
rapidity with which the panel may be developed. 

[0049] There are several benefits to using a panel of agents in a method for detecting a disease 
state, and for discriminating between types of disease states. One benefit is that a panel of agents 
has sufficient redundancy to permit detection and characterization of disease states thereby 
increasing the sensitivity and specificity of the test. Given the heterogeneous nature of many 
disease states, no single agent is capable of identifying the vast majority of cases. 
[0050] An additional benefit to using a panel is that use of a panel permits discrimination 
between the various types of a disease state based on specific patterns (probe localization and 
density/concentration) of expression. As the various types of a disease may exhibit dramatic 
differences in their rate of progression, response to therapy, and lethality, knowledge of the 
specific type can help physicians choose the optimal therapeutic approach. 
2. The Panel 

[0051] The panel of the present invention comprises a plurality of agents, each of which 
quantitatively binds to a disease marker, wherein the pattern (localization and 
density/concentration) of binding of the component agents of the panel is diagnostic of the 
presence or specific nature of a disease state. Therefore, the panel may be a detection panel or a 
discrimination panel. A detection panel detects whether a generic disease state is present in a 
sample of cells, while a discrimination panel discriminates among different specific disease states 
in a sample of cells known to be affected by a disease state which comprises different types of 
diseases. The difference between a detection panel and a discrimination panel lies in the specific 
agents that the panels comprise. A detection panel comprises agents having a pattern of binding 
that is diagnostic of the presence of a disease state, while a discrimination panel comprises agents 
having a pattern of binding that allows for determining the specific nature (i.e., each type) of the 
disease state. 

[0052] A panel, by definition, contains more than one member. There are several reasons why 
it is beneficial to use a panel of markers rather than just one marker alone to detect a generic 
disease state or to discriminate among specific disease states. One reason is the unlikely 
existence of a probe for one single marker that is present in all diseased cells yet not present in 
healthy cells, whose behavior can be measured with a high specificity and sensitivity to yield an 
accurate test result. If such a single probe existed for detection of a particular disease with high 
sensitivity and specificity, it would already have been utilized for clinical testing. Rather, it is 
the directed selection of panel tests, each consisting of multiple probes, that together can provide 
the range of detection capability to ensure clinically adequate testing. 
[0053] If one nevertheless chooses to construct a panel test comprising one or a very few 
probes, then the failure of any single marker/probe combination to perform its labeling function 
for any reason (for example, diminished reactivity of the specimen cells due to biological 
variability; inherent variability between lots of probe reagents; a weak, outdated or defective 
processing reagent; improper processing time or conditions for that probe) could result in a 
catastrophic failure of the test to detect or discriminate the target disease. The inclusion of 
multiple, and even redundant probes in each panel test greatly enhances the probability that a 
failure of any one probe will not cause a catastrophic failure of the test. 
[0054] A probe is any molecular structure or substructure that binds to a disease marker. The 
term "agent", as used herein, may also refer to a molecular structure or substructure that binds to a 
disease marker. Molecular probes are homing devices used by biologists and clinicians to detect 
and locate markers indicative of the specific disease states. For example, antibodies may be 
produced that bind specifically to a protein previously identified as a marker for small cell lung 
cancer. This antibody probe can then be used to localize the target protein marker in cells and 
tissues of patients suspected of having the disease by using appropriate immunochemical 
protocols and incubations. If the antibody probe binds to its target marker in a stoichiometric 
(i.e., quantitative) fashion and is labeled with a chromogenic or colored "tag", then localization 
and quantitation of the probe and, indirectly, its target marker may be accomplished using an 
optical microscope and image cytometry technology. 
[0055] The present invention contemplates detecting changes in molecular marker expression 
at the DNA, RNA or protein level using any of a number of methods available to an ordinary 
skilled artisan. Exemplary probes may be a polyclonal or monoclonal antibody or fragment 
thereof or a nucleic acid sequence that is complementary to the nucleic acid sequence encoding 
a molecular marker in the panel. A probe may also be a stain, such as a DNA stain. Many of the 
antibodies used in the present invention are specific to a variety of cell surface or intracellular 
antigens as marker substances. The antibodies may be synthesized using techniques generally 
known to those of skill in the art. For example, after the initial raising of antibodies to the 
marker, the antibodies can be sequenced and subsequently prepared by recombinant techniques. 
Alternatively, antibodies may be purchased. 

[0056] In embodiments of the present invention, the probe contains a label. A probe containing 
a label is often referred to herein as a "labeled probe". The label may be any substance that can 
be attached to a probe so that when the probe binds to the marker a signal is emitted or the 
labeled probe can be detected by a human observer or an analytical instrument. This label may 
also be referred to as a "tag". The label may be visualized using reader instrumentation. The 
term "reader instrumentation" refers to the analytical equipment used to detect a probe. Labels 
envisioned by the present invention are any labels that emit a signal and allow for identification 
of a component in a sample. Preferred labels include radioactive, fluorogenic, chromogenic or 
enzymatic moieties. Therefore, possible methods of detection include, but are not limited to, 
immunocytochemistry, immunohistochemistry, in situ hybridization, fluorescent in situ 
hybridization, flow cytometry, and image cytometry. The signal generated by the labeled probe is 
of sufficient intensity to permit detection by a medical practitioner. 
[0057] A "marker", "disease marker" or "molecular marker" is any molecular structure or 
substructure that is correlated with a disease state or pathogen. The term "antigen" may be used 
interchangeably with "marker". Broadly defined, a marker is a biological indicator that may be 
deliberately used by an observer or instrument to reveal, detect, or measure the presence or 
frequency and/or amount of a specific condition, event or substance. For example, a specific and 
unique sequence of nucleotide bases may be used as a genetic marker to track patterns of genetic 
inheritance among individuals and through families. Similarly, molecular markers are specific 
molecules, such as proteins or protein fragments, whose presence within a cell or tissue indicates 
a particular disease state. For example, proliferating cancer cells may express novel cell-surface 
proteins not found on normal cells of the same type, or may over-express specific secretory 
proteins whose increased or decreased abundance (e.g., overexpression or underexpression, 
respectively) can serve as markers for a particular disease state. 

[0058] Suitable markers for cytology panels are substances that are localized in or on the 
nucleus, cytoplasm or cell membrane. Markers may also be localized in organelles located in 
any of these locations in the cell. Exemplary markers localized in the nucleus include but are not 
limited to retinoblastoma gene product (Rb), Cyclin A, nucleoside diphosphate kinase/nm23, 
telomerase, Ki-67, Cyclin D1, proliferating cell nuclear antigen (PCNA), p120 
(proliferation-associated nucleolar antigen) and thyroid transcription factor 1 (TTF-1). Exemplary markers 
localized in the cytoplasm include but are not limited to VEGF, surfactant apoprotein A (SP-A), 
nucleoside nm23, melanoma antigen-1 (MAGE-1), Mucin 1, surfactant apoprotein B (SP-B), ER 
related protein p29 and melanoma antigen-3 (MAGE-3). Exemplary markers localized in the cell 
membrane include but are not limited to VEGF, thrombomodulin, CD44v6, E-Cadherin, Mucin 
1, human epithelial related antigen (HERA), fibroblast growth factor (FGF), hepatocyte growth 
factor receptor (C-MET), BCL-2, N-Cadherin, epidermal growth factor receptor (EGFR) and 
glucose transporter-3 (GLUT-3). An example of a marker located in an organelle of the 
cytoplasm is BCL-2, located (in part) in the mitochondrial membrane. An example of a marker 
located in an organelle of the nucleus is p120 (proliferation-associated nucleolar antigen), located 
in the nucleoli. 

[0059] Preferred are markers where changes in expression occur early in disease progression, 
are exhibited by a majority of diseased cells, allow for detection of in excess of 75% of a given 
disease type; most preferably in excess of 90% of a given disease type and/or allow for the 
discrimination between the nature of different types of a disease state. 

[0060] It is noted that the inventive panel may be referred to as a panel of probes or a panel of 
markers, since the probes bind to the markers. Therefore, the panel may comprise a number of 
markers or it may comprise a number of probes that bind to specific markers. For the sake of 
consistency, the present panel is referred to as a panel of probes; however, it could also be 
referred to as a panel of markers. 

[0061] Markers can also include features such as malignancy-associated changes (MACs) in 
the cell nucleus or features related to the patient's family history of cancer. Malignancy- 
associated changes, or MACs, are typically sub-visual changes that occur in normal-appearing 
cells located in the vicinity of cancer cells. These exceedingly subtle changes in the cell nucleus 
may result biologically from changes in the nuclear matrix and the chromatin distribution pattern. 
They cannot be appreciated even by trained observers through the visual observation of 
individual cells, but may be determined from statistical analysis of cell populations using highly 
automated, computerized high-speed image cytometry. Techniques for detection of MACs are 
well known to those of skill in the art and are described in more detail in: Gruner, O.C., Brit. J. 
Surg. 3 506-522 (1916); Neiburgs, H.E., et al., Transactions, Annual Mtg. Inter. Soc. Cytol. 
Council 137-144 (1959); Klawe, H., Acta Cytol. 18 30-33 (1974); Wied, G.L., et al., Analyt. 
Quant. Cytol. 2 257-263 (1980); and Burger, G., et al., Analyt. Quant. Cytol. 3 261-271 (1981). 
[0062] The present invention encompasses any marker that is correlated with a disease state. 
The individual markers themselves are mere tools of the present invention. Therefore, the 
invention is not limited to specific markers. One way to classify markers is by their functional 
relationship to other molecules. As used herein, a "functionally related" marker is a component 
of the same biological process or pathway as the marker in question and would be known by a 
person of skill in the art to be abnormally expressed together with the marker in question. For 
example, many markers are associated with a cell proliferation pathway, such as fibroblast 
growth factor (FGF), vascular endothelial growth factor (VEGF), Cyclin A and Cyclin D1. Other 
markers are glucose transporters, such as Glut-1 and Glut-3. 

[0063] A person of ordinary skill in the art is well equipped to determine a functionally related 
marker and may research various markers or perform experiments in which the functional 
behavior of a marker is determined. By way of non-limiting example, a marker may be classified 
as a molecule involved in angiogenesis, a transmembrane glycoprotein, a cell surface 
glycoprotein, a pulmonary surfactant protein, a nuclear DNA-binding phosphoprotein, a 
transmembrane Ca2+-dependent cell adhesion molecule, a regulatory subunit of the 
cyclin-dependent kinases (CDKs), a nucleoside diphosphate kinase, a ribonucleoprotein enzyme, a 
nuclear protein that is expressed in proliferating normal and neoplastic cells, a cofactor for DNA 
polymerase delta, a gene that is silent in normal tissues yet when it is expressed in malignant 
neoplasms is recognized by autologous, tumor-directed and specific cytotoxic T cells (CTLs), a 
glycosylated secretory protein, the gastrointestinal tract or genitourinary tract, a hydrophobic 
protein of a pulmonary surfactant, a transmembrane glycoprotein, a molecule involved in 
proliferation, differentiation and angiogenesis, a proto-oncogene, a homeodomain transcription 
factor, a mitochondrial membrane protein, a molecule found in nucleoli of a rapidly proliferating 
cell, a glucose transporter, or an estrogen-related heat shock protein. 
[0064] Classes of biomarkers and probes include, but are not limited to: (a) morphologic 
biomarkers, including DNA ploidy, MACs and premalignant lesions; (b) genetic biomarkers, 
including DNA adducts, DNA mutations and apoptotic indices; (c) cell cycle biomarkers, 
including cellular proliferation, differentiation, regulatory molecules and apoptosis markers; and 
(d) molecular and biochemical biomarkers, including oncogenes, tumor suppressor genes, tumor 
antigens, growth factors and receptors, enzymes, proteins, prostaglandin levels and adhesion 
molecules. 

[0065] A "disease state" may be any cell-based disease. In some embodiments the disease state 
is cancer. In other embodiments, the disease state is an infectious disease. The cancer may be 
any cancer, including, but not limited to, epithelial cell-based cancers from the pulmonary, 
urinary, gastrointestinal, and genital tracts; solid and/or secretory tumor-based cancers, such as 
sarcomas, breast cancer, cancer of the pancreas, cancer of the liver, cancer of the kidneys, cancer 
of the thyroid, and cancer of the prostate; and blood-based cancers, such as leukemias and 
lymphomas. Exemplary cancers which may be detected by the present invention are lung, 
bladder, gastrointestinal, cervical, breast or prostate cancer. Exemplary infectious diseases which 
may be detected are cell-based diseases in which the infectious organism is a virus, bacteria, 
protozoan, parasite, or fungus. The infectious disease, for example, may be HIV, hepatitis, 
influenza, meningitis, mononucleosis, tuberculosis and sexually transmitted diseases (STDs), 
such as chlamydia, trichomonas, gonorrhea, herpes and syphilis. 

[0066] As used herein, the term "generic disease state" refers to a disease which comprises 
several types of specific diseases, such as lung cancer, sexually transmitted diseases and 
immune-based diseases. Specific disease states are also referred to as histologic types of 
diseases. For example, the term "lung cancer" comprises several specific diseases, among which 
are squamous cell carcinoma, adenocarcinoma, large cell carcinoma, small cell lung cancer and 
mesothelioma. The term "sexually transmitted diseases" comprises several specific diseases, 
among which are Gonorrhea, Human Papilloma Virus (HPV), herpes and Syphilis. The term 
"immune-based diseases" comprises several specific diseases, such as systemic lupus 
erythematosus (Lupus), rheumatoid arthritis and pernicious anemia. 
[0067] As used herein, the term "high-risk population" refers to a group of individuals who are 
exposed to disease causing agents, e.g., carcinogens, either at home or in the workplace (e.g., a 
"high-risk population" for lung cancer might be exposed to smoking, passive smoking and 
occupational exposure). Individuals in a "high-risk population" may also have a genetic 
predisposition. 

[0068] The term "at-risk" refers to individuals who are asymptomatic but, because of a family 
history or significant exposure, are at a significant risk of developing a disease state (e.g., an 
individual at risk for lung cancer with a > 30 pack-year history of smoking; "pack-year" is a 
measurement unit computed by multiplying the number of packs smoked per day by the 
number of years of this exposure). 
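
The pack-year arithmetic defined above can be written out as a short Python sketch. The function names are hypothetical, and the 30 pack-year cutoff is taken from the lung cancer example in the text rather than being a general rule.

```python
def pack_years(packs_per_day, years_smoked):
    """Pack-years = packs smoked per day multiplied by years of exposure."""
    return packs_per_day * years_smoked


def exceeds_smoking_threshold(packs_per_day, years_smoked, threshold=30):
    """True when the smoking history is greater than the threshold (default 30)."""
    return pack_years(packs_per_day, years_smoked) > threshold


# Two packs per day for 20 years is a 40 pack-year history.
print(pack_years(2, 20))
print(exceeds_smoking_threshold(2, 20))
```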

[0069] Cancer is a disease in which cells divide without control due to, for example, altered 
gene expression. In the methods and panels of the present invention, the cancer may be any 
malignant growth in any organ. For example, the cancer may be lung, bladder, gastrointestinal, 
cervical, breast or prostate cancer. Each cancer may comprise a collection of diseases or 
histological types of cancer. The term "histologic type" refers to cancers of different histology. 
Depending on the cancer there can be one or several histologic types. For example, lung cancer 
includes, but is not limited to, squamous cell carcinoma, adenocarcinoma, large cell carcinoma, 
small cell carcinoma and mesothelioma. Knowledge of the histologic type of cancer affecting a 
patient is very useful because it helps the medical practitioner to localize and characterize the 
disease and to determine the optimal treatment strategy. 

[0070] Infectious diseases include cell-based diseases in which the infectious organism is a 
virus, bacteria, protozoan, parasite or fungus. 

[0071] Exemplary detection and discrimination panels are panels that detect lung cancer, a 
general disease state, and panels that discriminate a single lung cancer type, a specific disease 
state, from all other types of lung cancer and false positives. False positives can include 
metastatic cancer of a different type, such as metastasized liver, kidney or pancreatic cancer. 

3. Methods of Making a Panel 
[0072] The method of making a panel for detecting a generic disease state or discriminating 
between specific disease states in a patient involves determining the sensitivity and specificity of 
binding of probes to a library of markers associated with a generic or specific disease state and 
selecting a plurality of said probes whose pattern of binding (localization and 
density/concentration) is diagnostic of the presence or specific nature of the disease state. In 
some embodiments, optional preliminary pruning and preparation steps are performed. The 
method of making a panel of the present invention involves analyzing the pattern of binding of 
probes to markers in known histologic pathology samples, i.e. gold standards. The classifier 
designed on the gold standard data can then be used to design a classifier for cytometry, 
especially automated cytometry. Therefore, the set of marker probes selected from the pathology 
analysis is used to prepare a new training data set taken from a cytology sample, such as sputum, 
fine needle aspirations, urine, etc. Cells shed from the specified lesions will stain in a similar 
fashion to the gold standards. The method described here eliminates the experimental error in 
selecting the best features set because the integrity of the diagnosis based on gold standard 
histologic pathology samples is high. Although it is, in principle, possible to use cytology 
samples to produce a panel, this is less desirable because cytology samples contain debris, there 
may be deterioration of the cells in a cytology sample, and the pathology diagnosis may be 
difficult to confirm clinically. 

[0073] A library of markers is a group of markers. The library can comprise any number of 
markers. However, in some embodiments the number of markers in the library is limited by 
technical and/or commercial practicalities, such as specimen size. For example, in some 
embodiments, each specimen is tested against all of the markers in the panel. Therefore, the 
number of markers must not be larger than the number of samples into which the specimen may 
be divided. Another technical practicality is time. Typically, the library contains less than 60 
markers. Preferably, the library contains less than 50 markers. More preferably, the library 
contains less than 40 markers. Most preferably the library contains 10-30 markers. It is 
preferable that the library of potential panel members contain more than 10 markers so that there 
is opportunity to optimize the performance of the panel. As used herein, the term "about" means 
plus or minus 3 markers. 

[0074] In some embodiments, a library is obtained by consulting sources which contain 
information about various markers and correlations between the markers and generic/specific 
disease states. Exemplary sources include experimental results, theoretical or predicted analyses 
and literary sources, such as journals, books, catalogues and web sites. These various sources 
may use histology or cytology and may rely on cytogenetics, such as in situ hybridization; 
proteomics, such as immunohistochemistry; cytometry, such as MACs or DNA ploidy; and/or 
cytopathology, such as morphology. The markers may be localized anywhere in or on a cell. For 
example, the markers may be localized in or on the nucleus, the cytoplasm or the cell membrane. 
The marker may also be localized in an organelle within any of the aforementioned localizations. 
[0075] In some embodiments, the library may be of an unsuitable size. Therefore, one or more 
pruning steps may be required prior to initiating the basic method for making a panel. The 
pruning step may involve one or several successive pruning steps. One pruning step may 
involve, for example, setting an arbitrary threshold for sensitivity and/or specificity. Therefore, 
any marker whose experimental or predicted sensitivity and/or specificity falls below the 
threshold may be removed from the library. Other exemplary pruning steps, which may be 
performed alone or in sequence with other pruning steps, may rely on detection technology 
requirements, access constraints and irreproducibility of reported results. With respect to 
detection technology requirements, it is possible that the machinery required to detect a particular 
marker is unavailable. With respect to access constraints, it is possible that licensing restrictions 
make it difficult or impossible to obtain a probe that binds to a particular marker. In some 
embodiments, a due diligence study is performed on each marker. 
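
A threshold-based pruning step of the kind described above might look like the following Python sketch. It uses the standard definitions of sensitivity (true positives divided by true positives plus false negatives) and specificity (true negatives divided by true negatives plus false positives). The marker names, counts and thresholds are invented for illustration, and the choice to drop a marker when either measure falls below its threshold is an assumption; the text allows sensitivity and/or specificity to be used.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased specimens that the marker labels as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-diseased specimens that the marker labels as negative."""
    return true_neg / (true_neg + false_pos)


def prune_library(library, min_sensitivity, min_specificity):
    """Keep only markers meeting both arbitrary thresholds (assumed rule)."""
    kept = {}
    for marker, counts in library.items():
        sens = sensitivity(counts["tp"], counts["fn"])
        spec = specificity(counts["tn"], counts["fp"])
        if sens >= min_sensitivity and spec >= min_specificity:
            kept[marker] = counts
    return kept


# Hypothetical counts reported or predicted for three candidate markers.
library = {
    "marker_A": {"tp": 45, "fn": 5, "tn": 40, "fp": 10},
    "marker_B": {"tp": 20, "fn": 30, "tn": 48, "fp": 2},
    "marker_C": {"tp": 35, "fn": 15, "tn": 25, "fp": 25},
}
pruned = prune_library(library, min_sensitivity=0.5, min_specificity=0.7)
print(sorted(pruned))
```

Other pruning criteria named in the text, such as detection technology requirements or access constraints, are not numeric and would be applied as separate manual filters.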

[0076] In some embodiments, prior to beginning the basic method for making a panel, it may 
be necessary to perform preparation steps. Exemplary preparation steps include optimizing the 
protocols for objective quantitative detection of the markers in the library and collecting 
histology specimens. Optimization of the protocols for objective quantitative detection of the 
markers is within the skill of an ordinary artisan. For example, the necessary reagents and 
supplies must be obtained, such as buffers, reagents, software and equipment. It is possible that 
the concentration of reagents may need to be adjusted. For example, if non-specific binding is 
observed, a person of ordinary skill in the art may dilute the concentration of the probe solution. 
[0077] In some embodiments, the histology specimens are Gold Standards. The term "Gold 
Standard" is known by a person of ordinary skill in the art to mean that the histology and clinical 
diagnosis of the specimen is known. The gold standards are often referred to as a "training" data 
set. The gold standards comprise a set of measurements, or reliable estimates, of all the features 
that may contribute to the discriminating process. Such features are collected from samples 
collected from a representative number of patients with known disease states. The standard 
samples can be cytology samples but this is less desirable for panel selection. 
[0078] The histology samples may be obtained by any technique known to those of skill in the 
art, for example biopsy. In some embodiments, it is necessary that the size of the specimen per 
patient be large enough so that enough tissue sections can be obtained to test each marker in the 
library. 

[0079] In some embodiments, specimens are obtained from multiple patients diagnosed with 
each specific disease state. One specimen per patient may be obtained, or multiple specimens per 
patient may be obtained. In embodiments in which multiple specimens are obtained from 
individual patients, the expertise of the surgeon is relied upon to establish that each specimen 
obtained from a single patient is similar to the other specimens obtained from that patient. 
Specimens are also obtained from a control group of patients. The control group of patients may 
be healthy patients or patients that are not suffering from the generic or specific disease state that 
is being tested. 

[0080] The first step of the basic method is determining the sensitivity and specificity of 
binding of probes to a library of markers associated with the desired disease state. In this step, a 
probe that is specific for each marker in the library is applied to a sample of the patients' 
specimens. Therefore, in some embodiments, if there are, for example, 30 markers in the library, 
each patient's specimen will be divided into 30 samples and each sample will be treated with a 
probe that is specific for one of the 30 markers. The probe contains a label that may be 
visualized. Therefore, the pattern and level of binding of the probe to the marker can be detected. 
The pattern and level of binding may be detected either quantitatively, i.e., by an analytical 
instrument, or qualitatively, by a human, such as a pathologist. 

[0081] In some embodiments, an objective and/or quantitative scoring method is developed to 
detect the pattern and level of binding of the probe to the markers. The scoring method may be 
heuristically designed. Scoring methods are used to objectify a subjective interpretation, for 
example, by a pathologist. It is within the skill of an ordinary artisan to determine a suitable 
scoring method. In some embodiments, the scoring method may comprise categorizing features, 
such as the density of a marker probe stain as: none, weak, moderate, or intense. In another 
embodiment, these features may be measured with algorithms operating on microscope slide 
images. An exemplary scoring method is one in which the proportions and density are 
consolidated into a single "H Score" obtained by grading the intensity as: none = 0, weak = 1, 
moderate = 2, intense = 3, and the percentage of cells as: 0-5% = 0, 6-25% = 1, 26-50% = 2, 
51-75% = 3, >75% = 4, and then multiplying the two grades together. For example, 50% weakly 
stained plus 50% moderately stained would score 6 = (1 x 2) + (2 x 2). The "H score" honors the 
late Kenneth Hirsch, one of the present inventors. 
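
The H score arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the grading rule as stated in the text; the function and variable names are hypothetical, and the handling of percentages that fall between the stated bands (for example 5.5%) is an assumption.

```python
# Intensity grades as given in the text.
INTENSITY_GRADE = {"none": 0, "weak": 1, "moderate": 2, "intense": 3}


def percentage_grade(percent):
    """Map a percentage of cells (0-100) to the 0-4 grade given in the text.

    Assumption: values between the stated bands (e.g. 5.5) are assigned to
    the next higher band.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent <= 5:
        return 0
    if percent <= 25:
        return 1
    if percent <= 50:
        return 2
    if percent <= 75:
        return 3
    return 4


def h_score(staining):
    """Sum (intensity grade x percentage grade) over the intensity classes.

    `staining` maps an intensity label to the percentage of cells stained
    at that intensity, e.g. {"weak": 50, "moderate": 50}.
    """
    return sum(
        INTENSITY_GRADE[label] * percentage_grade(percent)
        for label, percent in staining.items()
    )


# The worked example from the text: (1 x 2) + (2 x 2) = 6.
print(h_score({"weak": 50, "moderate": 50}))
```

With this rule the largest possible score for a single intensity class is 3 x 4 = 12 (more than 75% of cells intensely stained).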

[0082] An ordinary artisan is capable of addressing issues related to minimizing potential 
biases related to pathologists and samples. For example, randomizing may be used to minimize 
the chance of having a systematic error. Blinding may be used to eliminate experimental biases 
by the people conducting the experiments. For example, in some embodiments, 
pathologist-to-pathologist variation may be minimized by conducting a double blind study. As used herein, the 
term "double blind study" refers to a well established method for avoiding biases, where the data 
collection and data analysis are done independently. In other embodiments, sample-to-sample 
variation is minimized by randomizing the samples. For example, the samples are randomized 
before the pathologist analyzes them. There is also randomization involved in the experimental 
protocols. In some embodiments, each sample is analyzed by at least two pathologists. For each 
patient, a reliable assessment of the binding of the probe to the marker is obtained. In one 
embodiment, this diagnosis is made by qualified pathologists, using two pathologists per patient, 
to check for reliability. 
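
As a minimal sketch of the randomizing and blinding described above, the following Python assigns neutral codes to samples in a random reading order and computes a simple agreement rate between two pathologists. The code labels, function names and the use of raw percent agreement as the reliability check are assumptions made for illustration; the text does not prescribe a particular procedure.

```python
import random


def blind_and_shuffle(sample_ids, seed=None):
    """Randomize the reading order and replace sample IDs with neutral codes.

    Returns (reading_order, key). `reading_order` lists the codes in the
    order they should be read; `key` maps each code back to the original
    sample ID and would be withheld from the pathologists.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    key = {f"S{i:04d}": sample_id for i, sample_id in enumerate(shuffled, start=1)}
    return list(key), key


def agreement_rate(scores_a, scores_b):
    """Fraction of samples on which two pathologists gave the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return matches / len(scores_a)


order, key = blind_and_shuffle(["P1", "P2", "P3", "P4"], seed=7)
print(order)                                           # coded labels, in reading order
print(agreement_rate([1, 2, 3, 0], [1, 2, 0, 0]))      # 3 of 4 scores agree
```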

[0083] A sufficient number of samples should be collected to produce reliable designs and 
reliable statistical performance estimates. It is within the skill of a normal artisan to determine 
how many samples are sufficient to produce reliable designs and reliable statistical performance 
estimates. Most standard classifier design packages have methods for determining the reliability 
of the performance estimates and the sample size should be progressively increased until reliable 
estimates are achieved. For example, sufficient estimates to produce reliable designs may be 
achieved with 200 samples collected and 27 different features estimated from each sample. 
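
One simple way to judge whether an estimate such as sensitivity is reliable at a given sample size is the width of a normal-approximation confidence interval for a proportion. The sketch below uses that approximation; it is offered as an assumption for illustration and is not the method built into any particular classifier design package. The function names and the tolerance are hypothetical.

```python
import math


def ci_half_width(p_hat, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a
    proportion p_hat estimated from n samples (z=1.96 gives about 95%)."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def smallest_adequate_n(p_hat, candidate_sizes, max_half_width):
    """Return the smallest candidate sample size whose interval is narrow
    enough, or None if none of the candidates is adequate."""
    for n in sorted(candidate_sizes):
        if ci_half_width(p_hat, n) <= max_half_width:
            return n
    return None


# Progressively larger sample sizes for an estimated sensitivity of 0.90,
# asking for an interval no wider than +/- 0.05.
print(smallest_adequate_n(0.90, [50, 100, 200, 400], 0.05))
```

The normal approximation is known to be poor for very small samples or proportions close to 0 or 1, so an exact or resampling-based interval may be preferable in those cases.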
[0084] The second step is selecting a limited plurality of probes. The selecting step may 
employ statistical analysis and/or pattern recognition techniques. In order to perform the 
selecting step, the data may be consolidated into a database. In some embodiments, the probes 
may be numbered to render their method of action as unseen during the analysis of their 
effectiveness and further minimize biases. Rigorous statistical techniques are used because of the 
large amount of data that is generated by this method. Any statistical method may be used and 
an ordinary skilled statistician will be able to identify which and how many methods are 
appropriate. 
[0085] Any number of statistical analysis and/or pattern recognition methods may be 
employed. Since the structure of the data is initially unknown, and since different classifier 
design methods perform better for different structures, it is preferred to use at least two design 
methods on the data. In some embodiments, three different methodologies may be used. One of 
ordinary skill in the art of statistical analysis and/or pattern recognition of data sets would 
recognize from characteristics of the data set structures that certain statistical methods would be 
more likely to yield an efficient result than others, where efficient in this case means achieving a 
certain level of sensitivity and specificity with a desired number of probes. A person of ordinary 
skill in the art would know that the efficiency of the statistical analysis and/or method is data 
dependent. Exemplary statistical analysis and/or pattern recognition methods are described 
below: 

[0086] a) A Decision Tree Method, known as C4.5. C4.5 is public domain software available 
via ftp from http://www.cse.unsw.edu.au/~quinlan/. This is well suited to data that can be best 
classified by sequentially applying a decision threshold to specific features in turn. This works 
best with uncorrelated data; it also copes with data with similar means provided the variances 
differ. The C4.5 package was used to provide the examples shown herein. 
[0087] b) Linear Discriminant Analysis. This involves finding weighted combinations of the 
features that give the best separation of the classes. These methods work well with correlated 
data, but not in data with similar means and different variances. Several statistical packages were 
used (SPSS, SAS and R), depending on the performance estimates and graphical outputs 
required. Fisher's linear discriminant function was used to obtain the classifier that minimized 
the error rate. A canonical discriminant function was used to compute receiver operating 
characteristic (ROC) curves showing the trade-off between sensitivity and selectivity as the 
decision threshold is changed. 

[0088] c) Logistic Regression. This is a non-linear transformation of the linear regression 
model: the dependent variable is replaced by a log odds ratio (logit). Linear regression, like 
discriminant analysis, belongs to a class of statistical methods founded on linear models. Such 
models are based on linear relationships between the explanatory variables. 
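
To make the idea of "applying a decision threshold to specific features" in method a) concrete, the following sketch finds the single threshold on one feature that best separates two classes by information gain. It is not the C4.5 implementation (C4.5 by default ranks tests by gain ratio and builds a full, pruned tree); the function names and the sample scores are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the split `value <= threshold` with the
    highest information gain, or (None, 0.0) if no split helps."""
    base = entropy(labels)
    distinct = sorted(set(values))
    best = (None, 0.0)
    # Candidate thresholds are midpoints between adjacent distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - weighted
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical marker scores for three control and three cancer samples.
scores = [0, 1, 2, 6, 8, 9]
classes = ["control", "control", "control", "cancer", "cancer", "cancer"]
print(best_threshold(scores, classes))
```

A tree is built by repeating this search over every feature at each node and recursing on the two resulting subsets.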
[0089] With a sufficient number of samples it is possible, using the above techniques and 
software packages, to search for combinations of features giving good discrimination between 
the classes. Other exemplary statistical analysis and/or pattern recognition methods are the linear 
Discriminant Function Method in SPSS and Logistic Regression Method in R and SAS. SPSS is 
the full product name and is available from SPSS, Inc., located at SPSS, Inc. Headquarters, 233 
S. Wacker Drive, 11th floor, Chicago, Illinois 60606 (www.spss.com). SAS is the full product 
name and is available from SAS Institute, Inc., 100 SAS Campus Drive, Cary, NC 27513-2414, 
USA (www.sas.com). R is the full product name and is available as Free Software, under the 
terms of the Free Software Foundation's GNU (General Public License), 
http://www.r-project.org/. 

[0090] In some embodiments, a correlation matrix is obtained. Correlation measures the 
amount of linear association between a pair of variables. A correlation matrix is obtained by 
correlating the data obtained with one marker to data obtained with another marker. A threshold 
correlation number may be set, for example, 50% correlation. In this case, all markers with a 
correlation number of 50% or higher would be considered correlate markers. 
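
The correlation step can be sketched as follows, using the Pearson coefficient as the measure of linear association. The marker names and scores are hypothetical, and treating "50% correlation" as a signed coefficient of 0.5 or more (rather than an absolute value) is an assumption, since the text does not say how negative correlations are handled.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlate_markers(scores, threshold=0.5):
    """Return (marker_a, marker_b, r) for every pair of markers whose
    correlation is at or above the threshold.

    `scores` maps a marker name to its per-sample scores, with samples in
    the same order for every marker.
    """
    pairs = []
    for a, b in combinations(sorted(scores), 2):
        r = pearson(scores[a], scores[b])
        if r >= threshold:
            pairs.append((a, b, r))
    return pairs


# Hypothetical scores for three markers measured on the same four samples.
example = {
    "marker_A": [1, 2, 3, 4],
    "marker_B": [2, 4, 6, 8],
    "marker_C": [4, 3, 2, 1],
}
print(correlate_markers(example))
```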
[0091] In some embodiments of the present invention, user supplied weighting factors may be 
used to obtain optimized panels. Weighting may be related to any factor. For example, certain 
markers may be weighted higher than others due to cost, commercial considerations, 
misclassifications or error rates, prevalence of a generic disease state in a geographic location, 
prevalence of a specific disease state in a geographic location, redundancy and availability of 
probes. Some factors related to cost that may encourage a user to weight certain markers higher 
than others are the cost of the probe and commercial access issues, such as license terms and 
conditions. Some factors related to commercial considerations that may encourage a user to 
weight certain markers higher than others are Research and Development (R&D) time, R&D 
cost, R&D risk, i.e., the probability that the probe will work, cost of final analytical instrument, 
final performance and the time to market. In a detection panel, for example, a factor related 
to misclassifications or error rates that may encourage a user to weight some markers higher than 
others is that it may be desirable to minimize false negatives. In a discrimination panel, on the 
other hand, it may be desirable to minimize false positives. A factor related to prevalence of 
a generic or specific disease state in a geographic area that may encourage a user to weight some 
probes higher than others is that in some geographic locations certain generic 
or specific diseases are more or less prevalent. With respect to redundancies, in some instances it 
is desirable to have redundancies in the panel. For example, if for some reason one probe fails to 
be detected, due to the biological variability of the markers in the panel, a disease state will still 
be detected by the other markers. In some embodiments, markers that are preferred redundant 
markers may be weighted more heavily. 

[0092] The invention is flexible in being adaptable to the availability of features where cost or 
supply problems may not allow the very best combination. In one embodiment, the invention can 
simply be applied to the available features, to find an alternative combination. In another 
embodiment, the algorithm is used to select features that allow cost weightings to be included in 
the selection process to arrive at a minimum cost solution. In the examples, marker performance 
estimates for combinations selected from all the markers collected or for only a group of 
commercially preferred probes are shown. The examples also demonstrate how the C4.5 
package can be used to down weight certain probes on the basis of their high cost. These probe 
combinations may not perform as well as the optimum combination, but the performance might 
be acceptable in circumstances where cost is a significant factor. 
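
A very simple way to illustrate trading performance against cost is to keep only the candidate panels that meet a minimum sensitivity and specificity and then take the cheapest. This filter-then-minimize sketch is a stand-in chosen for brevity; it is not the cost weighting performed inside C4.5, and the panel names, costs and performance figures are invented for illustration.

```python
def cheapest_acceptable_panel(candidates, min_sensitivity, min_specificity):
    """Return the lowest-cost candidate panel that meets both performance
    floors, or None if no candidate qualifies.

    Each candidate is a dict with keys "probes", "sensitivity",
    "specificity" and "cost".
    """
    acceptable = [
        panel for panel in candidates
        if panel["sensitivity"] >= min_sensitivity
        and panel["specificity"] >= min_specificity
    ]
    if not acceptable:
        return None
    return min(acceptable, key=lambda panel: panel["cost"])


# Hypothetical candidate panels with estimated performance and cost.
panels = [
    {"probes": ("probe_1", "probe_2", "probe_3"), "sensitivity": 0.95, "specificity": 0.93, "cost": 30.0},
    {"probes": ("probe_1", "probe_4"), "sensitivity": 0.91, "specificity": 0.90, "cost": 12.0},
    {"probes": ("probe_5",), "sensitivity": 0.80, "specificity": 0.85, "cost": 5.0},
]
choice = cheapest_acceptable_panel(panels, min_sensitivity=0.90, min_specificity=0.90)
print(choice["probes"] if choice else None)
```

Here the two-probe panel is chosen: it performs slightly worse than the three-probe panel but still clears both floors at lower cost.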
[0093] Some of the methods used allow weightings to be applied to the classes. This is 
available in C4.5 where the tree design can optimize the cost. Also, the Discriminant Function 
method gives a single parameter output which can be used to give a desired false positive or false 
negative probability. A plot of these parameters for different threshold settings is known as the 
receiver operating characteristic (ROC) curve. An ROC curve shows the estimated percentage of 
false positive against true positive scores for different threshold levels of a classifier. 
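
The points of an ROC curve can be computed directly from a single-parameter classifier output by sweeping the decision threshold. The sketch below assumes that a higher score indicates disease and that a sample is called positive when its score is at or above the threshold; the scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per
    distinct threshold, starting from the point (0.0, 0.0).

    `labels` holds True for diseased samples and False for controls.
    Both classes must be present, otherwise a rate is undefined.
    """
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step so that more samples are called positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, lab in zip(scores, labels) if s >= threshold and lab)
        fp = sum(1 for s, lab in zip(scores, labels) if s >= threshold and not lab)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical discriminant scores for two diseased and two control samples.
print(roc_points([0.9, 0.8, 0.4, 0.2], [True, True, False, False]))
```

Choosing an operating point on this curve is how a desired false positive or false negative probability is obtained from the single-parameter output.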
[0094] Given the heterogeneous nature of many generic disease states, the panels may be 
constructed with a degree of redundancy to ensure that the tests have sufficient sensitivity, 
specificity, positive predictive value (Positive Predictive Value = True Positives/(True Positives 
+ False Positives)) and negative predictive value (Negative Predictive Value = True 
Negatives/(False Negatives + True Negatives)) to justify their use as a population-based screen. 
However, local and regional differences may dictate specific use of the tests in different 
segments of the global market, and so may significantly influence the criteria used to construct 
the final panel test for a given market. While the optimization of clinical utility is of utmost 
importance, local factors including affordability (cost), technical competence, laboratory and 
healthcare provider resources, workflow issues, manpower requirements, and availability of the 
probes and labels will contribute to a final, local selection of the markers used in the panel. Well 
known linear discriminant function analysis is used to include and assess all potential selection 
factors, by which each local factor is represented by a term in the equation, and each is weighted 
according to its locally determined significance. In this way, a panel test optimized for use in 
one world region may differ from a panel test optimized for use in a different region. 
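
The four performance measures named in paragraph [0094] follow directly from the counts of true and false positives and negatives. The sketch below applies the formulas given in the text; the function name and the counts in the example are hypothetical.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute the four measures from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives respectively.
    """
    return {
        "sensitivity": tp / (tp + fn),   # True Positives / (True Positives + False Negatives)
        "specificity": tn / (tn + fp),   # True Negatives / (False Positives + True Negatives)
        "ppv": tp / (tp + fp),           # True Positives / (True Positives + False Positives)
        "npv": tn / (tn + fn),           # True Negatives / (False Negatives + True Negatives)
    }


# Hypothetical screen of 1100 people: 100 with disease, 1000 without.
metrics = screening_metrics(tp=90, fp=50, tn=950, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the two predictive values depend on how common the disease is in the tested population, which is one reason regional prevalence can change how a panel performs as a screen.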
[0095] Once detection or discrimination panels have been designed using the above described 
method, the next step is to validate the panel using known cytology samples. Prior to validation, 
optional optimization steps may be performed. In some embodiments, the method for collecting 
cytology samples may be improved. This encompasses methods of obtaining the sample from 
the patient as well as methods for mixing the cytology sample. In other embodiments, the 
cytology presentation methods may be improved. For example, identifying optimal fixatives 
(preservation fluids) or transportation fluids. 

[0096] The cytology samples used to validate the panels produced using the gold standard 
histology samples are cytology samples with known diagnoses. These samples may be collected 
using any method known by those of skill in the art. For example, sputum samples can be 
collected by spontaneous production, induced production and through the use of agents that 
enhance sputum production. The sample is contacted with each probe in the panel and the level 
and pattern of binding of the probes is analyzed to determine the performance of the panel. In 
some embodiments, it may be necessary to further optimize the panel. For example, it may be 
necessary to remove a probe from the panel. Or, it may be necessary to add an additional probe 
to the panel. Additionally, it may be necessary to replace one probe on the panel with another 
probe. If a new probe is added, this probe may be a correlate marker as determined from a 
correlation matrix. Alternatively, the probe may be a functionally similar marker. Once the 
panel is optimized, the panel may proceed for further testing in clinical studies. 
[0097] In other embodiments, it is not necessary to optimize the panel. If the results with the 
cytology samples correlate with the results from the histology samples, there may not be a need 
to optimize the panel and the panel may proceed for further testing in clinical studies. 
4. Methods of Use 

[0098] Once a panel is obtained using the above described method, it may be applied to 
cytologic samples. To illustrate the method, cancer, especially lung cancer, will be exemplified. 
Similar steps and procedures will be applied for other disease states. It is to be expected that 
cells shed from the specified lesions will stain in a similar fashion and show in a cytologic 
sample, such as a fine needle aspiration, sputum, or urine, in a similar fashion as in the histologic 
pathology samples used to obtain the panel. 



[0099] The basic method of the present invention typically involves two steps. First, a 
cytological sample suspected of containing diseased cells is contacted with a panel containing a 
plurality of agents, each of which quantitatively binds to a disease marker. Then, the level or 
pattern of binding of each agent to a disease marker is detected. The results of the detection may 
be used to diagnose the presence of a generic disease or to discriminate among specific disease 
states. An optional preliminary step is identifying an optimized panel of agents that will aid in 
the detection of a disease or the discrimination between disease states in a cytologic sample. 
[0100] Cytology specimens may include, but are not limited to, cellular samples collected from 
body fluids, such as blood, urine, spinal fluids, and lymphatic systems; epithelial cell-based 
organ systems, such as the pulmonary tract, e.g., lung sputum, urinary tract, e.g., bladder 
washings, genital tract, e.g., cervical Pap smears, and gastrointestinal tract, e.g., colonic 
washings; and fine needle aspirations from solid tissue sites in organs and systems such as the 
breast, pancreas, liver, kidneys, thyroid, bone marrow, muscles, prostate, and lungs; biopsies 
from solid tissue sites in organs and systems such as the breast, pancreas, liver, kidneys, thyroid, 
bone marrow, muscles, prostate, and lungs; and histology specimens, such as tissue from surgical 
biopsies. 

[0101] An illustrative panel of agents according to the present invention includes any number 
of agents that allows for accurate detection of malignant cells in a cytological sample. Molecular 
markers envisioned by the present invention may be any molecule that aids in the detection of 
malignant cells. Markers may be selected for inclusion in a panel based on several different 
criteria relating to changes in level or pattern of expression of the marker. Preferred are 
molecular markers where changes in expression: occur early in tumor progression, are exhibited 
by a majority of tumor cells, allow for detection of in excess of 75% of a given tumor type, most 
preferably in excess of 90% of a given tumor type and/or allow for the discrimination between 
histologic types of cancer. 

[0102] The first step of the basic method is the detection of changes in the level or pattern of 
expression of the panel of agents in a cytological sample. This step typically involves contacting 
the cytologic sample with an agent, such as a labeled polyclonal or monoclonal antibody or 
fragment thereof or a nucleic acid probe, and observing the signal in individual cells. Detection 
of cells where there is a change in signal is indicative of a change in the level of expression of the 
molecular marker to which the labeled probe is directed. The changes are based on an increase or 
decrease in the level of expression relative to nonmalignant cells obtained from the tissue or site 
being examined. 

[0103] An analysis of the changes in the level or pattern of expression of a panel of agents 
enables a skilled artisan to determine, with high sensitivity and high specificity, whether 
malignant cells are present in the cytologic sample. The term "sensitivity" refers to the 
conditional probability that a person having a disease will be correctly identified by a clinical 
test (i.e., the number of true positive results divided by the number of true positive and false 
negative results). Therefore, if a cancer detection method has high sensitivity, the percentage of 
cancers detected is high, e.g., 80%, preferably greater than 90%. The term "specificity" refers to the 
conditional probability that a person not having a disease will be correctly identified by a clinical 
test (i.e., the number of true negative results divided by the number of true negative and false 
positive results). Therefore, if a cancer detection method has high specificity, e.g., 80%, preferably 
90%, more preferably 95%, the percentage of false positives the method produces is low. A 
"cytologic sample" encompasses any sample collected from a patient that contains that patient's 
cells. Examples of cytological samples envisioned by the present invention include body fluids, 
epithelial cell-based organ system washings, scrapings, brushings, smears or effusions, and fine- 
needle aspirates and biopsies. 

[0104] Use of the markers described in this invention assumes that it is possible to obtain an 
adequate cytologic sample routinely and that the samples can be adequately preserved for 
subsequent evaluation. The cytologic sample may be processed and stored in a suitable 
preservative. Preferably, the cytologic sample is collected in a vial containing the preservative. 
The preservative is any molecule or combination of molecules known to maintain cellular 
morphology and inhibit or block degradation of cellular proteins and nucleic acids. To ensure 
proper fixation, the sample may be mixed at the collection site at high speeds to disaggregate the 
sample and/or break up obscuring material such as mucus, thereby exposing the cells to the 
preservative. 

[0105] Once a specimen is obtained, it is desirable to homogenize it, using an appropriate 
mixing device. This permits using aliquots for multiple purposes, including the possibility of 
sending aliquots to more than one testing site, as well as preparing multiple slides and/or multiple 
depositions on a slide. The initial homogenization of the specimen and of each aliquot before use 
will ensure that each individual slide will have substantially the same distribution of cells, so that 
comparisons of results from one slide to another will be meaningful. 

[0106] Preparation of a specimen for analysis involves applying a sample to a microscope slide 
using methods including, but not limited to, smears, centrifugation, or deposition of a monolayer 
of cells. Such methods may be manual, semi-automated, or fully automated. The cell suspension 
may be aspirated, depositing the cells on a filter, and a monolayer of cells transferred to a prepared 
slide that may be processed for further evaluation. By repeating this process additional slides 
may be prepared as necessary. The present invention encompasses detection of one molecular 
marker per slide. Detection of several molecular markers per slide is also envisioned. 
Preferably, 1-6 markers are detected per slide. In some embodiments 2 markers are detected per 
slide. In other embodiments, 3 markers are detected per slide. 

[0107] The present invention contemplates detecting changes in molecular marker expression 
at the DNA, RNA or protein level using any of a number of methods available to an ordinary 
skilled artisan. Detection of the changes in the level or pattern of expression of the molecular 
markers in a cytologic sample generally involves contacting a cytologic sample with a polyclonal 
or monoclonal antibody or fragment thereof or a nucleic acid sequence that is complementary to 
the nucleic acid sequence encoding a molecular marker in the panel, collectively "probes", and a 
label. Typically, the probe and label components are operatively linked so that when the probe ■ 
reacts with the molecular marker a signal is emitted (a "labeled probe"). Labels envisioned by 
the present invention are any labels that emit or enable a signal and allow for identification of a 
component in a sample. Preferred labels include radioactive, fluorogenic, chromogenic or 
enzymatic moieties. Therefore, possible methods of detection include, but are not limited to, 
immunocytochemistry; proteomics, such as immunochemistry; cytogenetics, such as in situ 
hybridization, and fluorescence in situ hybridization; radiodetection, cytometry and field effects, 
such as MACs and DNA ploidy (the quantitation of stoichiometrically-stained nuclear DNA 
using automated computerized cytometry); and cytopathology, such as quantitative 
cytopathology based on morphology. The signal generated by the labeled probe is preferably of 
sufficient intensity to permit detection by a medical practitioner or technician. 
[0108] Once the slide is prepared, a medical practitioner conducts a microscopic review of the 
slides in order to identify cells that exhibit a change in marker expression characteristic of a 
diagnosis of cancer. The medical practitioner may use an image analysis system and automated 
microscope to identify cells of interest. Analysis of the data may make use of an information 
management system and algorithms that will assist the physician in making a definitive diagnosis 
and selecting the optimal therapeutic approach. A medical practitioner may also examine the 
sample using an instrument platform that is capable of detecting the presence of the labeled 
agent. 

[0109] A molecular diagnostic panel assay will result in one or more glass microscope slides 
with labeled cells and/or tissue sections. Assessing these (cyto)pathology multilabeled-cell 
preparations objectively and with clinically meaningful results is a virtually insurmountable 
detection and perception problem for a human expert. 
[0110] Computer-aided imaging systems (i.e., Photonic Microscopes™) can be developed and 
used to assess quantitatively and reproducibly the amount and location of probe-labeled cells and 
tissues. Such Photonic Microscopes™ combine robotic slide-handling capabilities, data 
management systems (e.g., medical informatics), and quantitative digital (optical and electronic) 
image analysis hardware and software modules to detect and report cell-based probe content and 
localization data that cannot be obtained by human visualization with comparable sensitivity and 
accuracy. These probe data can be used to characterize and differentiate cellular samples based 
upon their related characteristics and differences in their respective cell-based markers for a 
variety of disease states. 

[0111] In the present methodology, the molecular diagnostic panels are 
applied to cell-based specimens and samples, and computer-aided imaging systems are 
subsequently used to quantify and report the results of the molecular diagnostic panel tests. Such 
imaging systems can be used to evaluate cell-based samples in which multiple probes are used 
simultaneously on a given slide-based sample, and in which the probes can be separately 
analyzed, quantified, and reported because the probes are differentiated by color on the 
microscope cytology or histology slide. 

[0112] The signals generated by a labeled agent in the sample may, if they are of appropriate 
type and of sufficient intensity, be detected by a human reviewer (e.g., pathologist) using a 
standard microscope or a Computer-Aided Microscope [167]. The Computer-Aided Microscope 
is an ergonomic, computer-interfaced microscope workstation that integrates mouse-driven 
control of microscope operation (e.g., stage movement, focusing) with computerized automation 
of key functions (e.g., slide scanning patterns). A centralized Data Management System stores, 
organizes and displays relevant patient information as well as results from all specimen 
screenings and pathologist reviews. An identification number that is imprinted onto barcodes 
and affixed to each sample slide uniquely identifies each sample in the database, and relates it to 
the original specimen and the patient. 

[0113] In a preferred embodiment the signals generated by a labeled agent in the sample will be 
detected and quantitated using an automated image analysis system, or Photonic Microscope, 
interfaced to the centralized Data Management System. The Photonic Microscope provides fully 
automated software control of the microscope operations and incorporates detectors and other 
components appropriate for quantitation even of signals not detectable by human reviewers, such 
as very faint signals or signals from radiolabeled moieties. The location of detected signals is 
stored electronically for rapid relocation by automated instruments, and for human review using a 
Computer-Aided Microscope [168]. 

[0114] The centralized Data Management System archives all patient and sample data using the 
bar-coded identification number. The data may be acquired asynchronously, from a multiplicity 
of sites, and may be derived from multiple reviews and analyses by human cytologists and/or 
automated analyzers. These data may include results from multiple sample slides representing 
aliquots from a single previously homogenized patient specimen. Part or all of the data may be 
transferred to or from a hospital's Laboratory Information System to meet reporting, archiving, 
billing or regulatory requirements. A single, comprehensive report with integrated results from 
panel tests and human reviews may be generated and delivered to the physician in hardcopy, or 
electronically through networked computers or the Internet. 
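
The record keeping described for the Data Management System, in which a barcoded identification number ties each slide to its specimen and patient and collects results from several reviewers, might be modeled along the following lines. Every class, field and identifier here is hypothetical; the text does not describe the actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class SlideRecord:
    """One sample slide, keyed by its barcoded identification number."""
    barcode_id: str
    specimen_id: str
    patient_id: str
    results: list = field(default_factory=list)  # (reviewer, finding) pairs


class SlideDatabase:
    """In-memory stand-in for a centralized store of slide records."""

    def __init__(self):
        self._by_barcode = {}

    def register(self, record):
        # The barcode must identify each slide uniquely.
        if record.barcode_id in self._by_barcode:
            raise ValueError(f"duplicate barcode: {record.barcode_id}")
        self._by_barcode[record.barcode_id] = record

    def lookup(self, barcode_id):
        return self._by_barcode[barcode_id]

    def add_result(self, barcode_id, reviewer, finding):
        # Results may arrive at different times from human or automated reviewers.
        self._by_barcode[barcode_id].results.append((reviewer, finding))

    def slides_for_patient(self, patient_id):
        return [r for r in self._by_barcode.values() if r.patient_id == patient_id]


db = SlideDatabase()
db.register(SlideRecord("BC-001", "SPEC-1", "PAT-1"))
db.register(SlideRecord("BC-002", "SPEC-1", "PAT-1"))  # second aliquot, same specimen
db.add_result("BC-001", "pathologist_A", "positive")
db.add_result("BC-001", "automated_analyzer", "positive")
print(len(db.slides_for_patient("PAT-1")), db.lookup("BC-001").results)
```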

[0115] In some embodiments, the instant method allows for differential discrimination of 
different diseases, such as different histologic types of cancers. The term "histologic type" refers 
to specific disease states. Depending on the general disease state there can be one or several 
histologic types. For example, lung cancer includes, but is not limited to, squamous cell 
carcinoma, adenocarcinoma, large cell carcinoma, small cell carcinoma and mesothelioma. 
Knowledge of the histologic type of cancer affecting a patient is very useful because it helps the 
medical practitioner to localize and characterize the disease and to determine the optimal 
treatment strategy. 

[0116] In order to determine the specific disease state, a panel of markers is selected that allows 
for discrimination between specific disease states. For example, within a panel of molecular 
markers, a pattern of expression may be identified that is indicative of a particular histologic type 
of cancer. The detection of the level of expression of the panel of molecular markers is achieved 
by the above-described methods. Preferably, a panel of 1-20 molecular markers is employed to 
discriminate among the various histologic types of lung cancer. However, most preferably, 4-7 
markers are used. Decision trees may be developed to aid in discriminating between different 
histologic types based on patterns of marker expression. 

[0117] In addition to allowing for the detection of malignant cells in a cytologic sample, the 
instant invention has utility in the molecular characterization of the disease state. Such 
information is often of prognostic significance and can assist the physician in the selection of the 
optimal therapeutic approach for a particular patient. In addition, the panel of markers described 
in this invention may have utility in monitoring the patient, either for recurrence or to measure the 
efficacy of the therapy being used to treat the disease. 

[0118] By way of non-limiting example, the presence of lung cancer may be detected by a lung 
cancer detection panel and the specific type of lung cancer may be detected by a discrimination 
panel. If the medical practitioner determines that malignant cells are present in the cytologic 
sample, a further analysis of the histologic type of lung cancer may be performed. The histologic 
type of lung cancer encompassed by the present invention includes but is not limited to squamous 
cell carcinoma, adenocarcinoma, large cell carcinoma, small cell carcinoma and mesothelioma. 
Figure 1 illustrates molecular markers that are preferable markers to be included in a panel for 
identifying different histologic types of lung cancer. The column labeled indicates the 
percentage of tumor specimens that express a particular marker. 
[0119] In determining the various histologic types of lung cancer, the relative level of
expression of a marker is analyzed. Figure 2 illustrates how different markers may be used to
discriminate among different histologic types of cancer. In this table, SQ indicates squamous cell
carcinoma, AD indicates adenocarcinoma, LC indicates large cell carcinoma, SC indicates small
cell carcinoma and ME indicates mesothelioma. The numbers appearing in each cell represent the
frequency of marker change in one cell type versus another, expressed as a ratio. To be included
in the table, the ratio must be greater than 2.0 or less than 0.5. A number larger than 100 generally
indicates that the second marker is not expressed. In such cases the denominator was set at 0.1 for
the purpose of the analysis. Finally, empty cells represent either no difference in expression or the
absence of expression data.
[0120] One method for analyzing the data collected is to construct decision trees. Schemes 1-4
are examples of decision trees that may be constructed to enable a differential determination of a
histologic type of lung cancer using the patterns of expression. The present invention is in no
way limited to the decision trees presented in Schemes 1-4. The relative level of expression of a
marker can be higher than, lower than, or the same as (ND) the level of expression of the molecular
marker in a malignant cell of a different histologic type. Each scheme enables a distinction between
five histologic types of lung cancer through the use of the indicated panel of molecular markers.
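
The inclusion rule described above for the Figure 2 table can be sketched in a few lines of Python. This is an illustration only and not part of the disclosed method: the function names, the choice to express frequencies as percentages, and the sample values are assumptions, while the 2.0 and 0.5 cut-offs and the 0.1 substitute denominator come from the text.

```python
def expression_ratio(freq_first, freq_second, floor=0.1):
    """Ratio of marker-change frequency in one histologic type versus another.

    Frequencies are assumed to be percentages. When the second type shows no
    expression, the denominator is replaced by `floor` (0.1 in the text), which
    is why very large ratios signal an unexpressed marker.
    """
    denominator = freq_second if freq_second > 0 else floor
    return freq_first / denominator


def include_in_table(ratio, upper=2.0, lower=0.5):
    """A ratio is tabulated only if it is greater than 2.0 or less than 0.5."""
    return ratio > upper or ratio < lower


# Hypothetical frequencies, for illustration only.
for first, second in [(60.0, 20.0), (45.0, 40.0), (80.0, 0.0)]:
    ratio = expression_ratio(first, second)
    print(first, second, round(ratio, 2), include_in_table(ratio))
```

Under these assumptions, a pair such as 45% versus 40% falls between the two cut-offs and would leave the table cell empty.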
[0121] For example, in Scheme 1 the panel consists of HERA, MAGE-3, Thrombomodulin and
Cyclin D1. First the sample is contacted with a labeled probe directed toward HERA. If the
expression of HERA is lower than the control, the test indicates that the histologic type of lung
cancer is mesothelioma (ME). If, however, the expression is higher than or the same as the control,
the sample is contacted with a probe directed toward MAGE-3. If the expression of MAGE-3 is
lower than the control, the sample is contacted with a labeled probe directed toward Cyclin D1
and a determination of small cell carcinoma (SC) or adenocarcinoma (AD) is possible. If the
expression of MAGE-3 is higher than or the same as the control, the sample is contacted with a
labeled probe directed toward Thrombomodulin and a determination of squamous cell carcinoma
(SQ) or large cell carcinoma (LC) is possible.
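
The branching described for Scheme 1 can be sketched as a small function. The sketch is a reading of the paragraph above rather than a reproduction of the Scheme 1 figure: the `Level` enum and the function name are hypothetical, and because the text does not say which Cyclin D1 or Thrombomodulin result points to which histologic type, the function stops at the pair of candidates and names the marker that would resolve them.

```python
from enum import Enum
from typing import FrozenSet, Optional, Tuple


class Level(Enum):
    """Expression of a marker relative to the control."""
    LOWER = "lower"
    SAME = "same"      # shown as ND in the schemes
    HIGHER = "higher"


def scheme1_candidates(hera: Level, mage3: Level) -> Tuple[FrozenSet[str], Optional[str]]:
    """Return (candidate histologic types, next marker to test or None)."""
    if hera is Level.LOWER:
        # Low HERA expression indicates mesothelioma; no further test needed.
        return frozenset({"ME"}), None
    if mage3 is Level.LOWER:
        # Cyclin D1 separates small cell carcinoma from adenocarcinoma.
        return frozenset({"SC", "AD"}), "Cyclin D1"
    # MAGE-3 higher than or the same as control: Thrombomodulin separates
    # squamous cell carcinoma from large cell carcinoma.
    return frozenset({"SQ", "LC"}), "Thrombomodulin"


print(scheme1_candidates(Level.HIGHER, Level.LOWER))
```

Returning the unresolved pair instead of a single label keeps the sketch from asserting a direction of change that the text leaves to the figure.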



Scheme 1 




[0122] In Scheme 2 the panel consists of E-Cadherin, Pulmonary Surfactant B and
Thrombomodulin. First the sample is contacted with a labeled probe directed toward
E-Cadherin. If the expression of E-Cadherin is lower than the control, the test indicates that the
histologic type of lung cancer is mesothelioma (ME). If, however, the expression is higher than or the
same as the control, the sample is contacted with a probe directed toward Pulmonary Surfactant
B. If the expression of Pulmonary Surfactant B is lower than the control, the sample is contacted
with a labeled probe directed toward Thrombomodulin and a determination of squamous cell
carcinoma (SQ) or large cell carcinoma (LC) is possible. If the expression of Pulmonary
Surfactant B is higher than or the same as the control, the sample is contacted with a labeled
probe directed toward CD44v6 and a determination of adenocarcinoma (AD) or small cell
carcinoma (SC) is possible. (See Schemes 3 and 4 for more examples of decision trees.)



Scheme 2

Scheme 3




Scheme 4 




[0123] A preferred method involves using panels of molecular markers where differences in the
pattern of expression permit discrimination between the various histologic types of lung
cancer.



[0124] Many different decision trees may be constructed to analyze the patterns of marker 
expression. This information may be used by physicians or other healthcare providers to make 
patient management decisions and select an optimal treatment strategy.

5. Reporting of Results of Panel Analysis
[0125] The results from the panel analysis may be reported in several ways. For example, the
results may be reported as a simple "yes or no" result. Alternatively, the result may be reported
as a probability that the test results are correct. For example, the results from a detection panel
study may indicate whether a patient has a generic disease state or not. As the panel also reports
the specificity and sensitivity, the results may also be reported as the probability that the patient
has a generic disease state. The results from a discrimination panel analysis will discriminate
among specific disease states. The results may be reported as a "yes or no" with respect to
whether the specific disease state is present. Alternatively, the results may be reported as a
probability that a specific disease state is present. It is also possible to perform several
discrimination panel analyses on a specimen from one patient and report a profile of the
probabilities that the disease state present is a specific disease state with respect to the other
possibilities. The other possibilities may also include false positives.
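
The text does not say how sensitivity and specificity would be turned into a probability for an individual patient. One standard route, shown here purely as an assumption, is Bayes' rule, which also requires the prevalence of the disease in the tested population. The function name and the input values are illustrative.

```python
def probability_of_disease_given_positive(sensitivity, specificity, prevalence):
    """Probability that a patient with a positive panel result has the disease.

    Applies Bayes' rule. `prevalence` is an extra input that the text does not
    mention; all three arguments are fractions between 0 and 1.
    """
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    total_positive = true_positive + false_positive
    if total_positive == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positive / total_positive


# Hypothetical panel: 90% sensitivity, 95% specificity, 10% prevalence.
print(round(probability_of_disease_given_positive(0.90, 0.95, 0.10), 2))
```

With these illustrative inputs the result is about 0.67, which shows how strongly the reported probability depends on the population being screened.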
[0126] In embodiments in which a profile of the probabilities of each specific disease state
being present is produced, there are several possible outcomes. For example, it is possible that
all of the probabilities will be very small. In this instance, it is possible that the
doctor will conclude that the patient's specimen diagnosis is a false positive. It is also possible
that all of the probabilities will be low except for one that is above 80-90%. In this instance, it is
possible that the doctor will conclude that the test verifies that the patient has the specific disease
state that showed the high probability. It is also possible that most of the probabilities will be
low, but similarly high probabilities are reported for two specific disease states. In this case, a
doctor may recommend more extensive panel testing to ensure that the correct disease state is
identified. Another possibility is that all of the probabilities reported will be low, with one being
slightly higher than the rest but not high enough to be in the 80-90% range. In this case, a doctor
may recommend more extensive panel testing to ensure that the correct disease state is identified
and/or to rule out metastatic cancer from a remote primary tumor of a different cancer type.
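
The four outcomes listed above can be sketched as a simple classification of a probability profile. This is a hedged illustration, not a clinical rule: the function name and outcome labels are hypothetical, `high` takes the lower end of the 80-90% range quoted in the text, and `negligible` is an assumed cut-off for "very small", which the text does not quantify.

```python
def interpret_profile(probabilities, high=0.80, negligible=0.05):
    """Map per-disease-state probabilities to one of the outcomes in the text.

    Returns (outcome label, list of disease states the outcome refers to).
    """
    high_states = sorted(s for s, p in probabilities.items() if p >= high)
    if len(high_states) == 1:
        # One state stands out: the test supports that specific disease state.
        return "single_high", high_states
    if len(high_states) > 1:
        # Two or more similarly high values: more extensive panel testing.
        return "multiple_high", high_states
    if all(p <= negligible for p in probabilities.values()):
        # Everything very small: the specimen may be a false positive.
        return "all_negligible", []
    # Nothing reaches the high range: more extensive testing, and possibly
    # ruling out metastatic cancer from a remote primary tumor.
    leading = max(probabilities, key=probabilities.get)
    return "inconclusive", [leading]


# Hypothetical profile, for illustration only.
profile = {"SQ": 0.90, "AD": 0.04, "LC": 0.03, "SC": 0.02, "ME": 0.01}
print(interpret_profile(profile))
```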
[0127] The following Example is illustrative of the method of the invention for selecting a
disease detection panel, disease discrimination panels, validation of the panels and use of the
panels in the clinic to screen for a disease and to discriminate among different subtypes of the
disease. Lung cancer was selected for this illustrative example, in part because of its importance
to world health, but it will be appreciated that similar procedures will apply to other types of
cancer, as well as to infectious, degenerative and autoimmune diseases, according to the
foregoing general disclosure.

ILLUSTRATIVE EXAMPLES
1. Lung Cancer 

[0128] The present method was used to develop lung cancer detection panels as well as single
lung cancer type specific discrimination panels. Lung cancer is an extremely complex collection
of diseases that can be segregated into two main classes. Non-small cell lung carcinoma
(NSCLC), which accounts for approximately 70 to 80% of all lung cancers, can be further
subdivided into three main histologic types including squamous cell carcinoma, adenocarcinoma,
and large cell carcinoma. The remaining 20 to 30% of lung cancer patients present with small
cell lung carcinoma (SCLC). In addition, malignant mesothelioma of the pleural space can
develop in individuals exposed to asbestos and will often spread widely, invading other thoracic
structures. Different forms of lung cancer tend to localize in different regions of the lung, have
different prognoses, and respond differently to various forms of therapy.
[0129] According to the latest statistics from the World Health Organization (Globocan 2000),
lung cancer has become the most common fatal malignancy in both men and women with an
estimated 1.24 million new cases and 1.1 million deaths each year. In the U.S. alone, the
National Cancer Institute reports that there are approximately 186,000 new cases of lung cancer
and each year 162,000 people die of the disease, accounting for 25% of all cancer-related deaths.
In the U.S., overall 1-year survival for patients with lung cancer is 40%; however, only 14% live
5 years. In other parts of the world, 5-year survival is significantly lower (5% in the UK). The
high mortality of lung cancer can be attributed to the fact that most patients (85%) are diagnosed
with advanced disease when treatment options are limited and the disease is likely to have
metastasized. In these patients, 5-year survival is between 2 and 30% depending on the stage at the
time of diagnosis. This is in sharp contrast to cases where patients are diagnosed early and 5-year
survival is greater than 75%. While it is true that a number of new chemotherapeutic agents
have been introduced into clinical practice for the treatment of advanced lung cancer, to date,
none has yielded a significant improvement in long-term survival. Even though patients with
early stage disease can presumably be cured by surgery, they remain at significant risk, as there is
a high probability that they will develop a second malignancy. Thus, for the lung cancer patient,
early detection and treatment followed by aggressive monitoring provides the best chance of
achieving significant improvements in long-term survival along with a reduction in morbidity
and cost.

[0130] At the present time, a patient is suspected of having lung cancer either because of a
suspicious lesion on X-ray or because the patient becomes symptomatic. As a result, most
patients are diagnosed with relatively late stage disease. In addition, because most methods lack
sufficient sensitivity with respect to the detection of early stage disease, the current policy of the
U.S. National Cancer Institute (NCI), National Institutes of Health, recommends against
screening for lung cancer even in populations of patients who are at significant risk. In this
embodiment of the present invention, however, sputum cytology is employed to provide a
relatively noninvasive, more effective and cost-effective means for the early detection of lung
cancer.

[0131] The specificity of sputum cytology is relatively high. Recent studies have indicated that
experienced cytotechnologists are able to recognize malignant or severely dysplastic cells with a
high degree of accuracy and reliability [10]. While the detection rate can be as high as 80 to 90%
when samples are collected from patients with a relatively advanced disease [11,12], overall,
sputum cytology has a sensitivity of only 30-40% [13,14]. The low sensitivity of sputum
cytology is particularly important given that obtaining and preparing the specimen can be
relatively expensive. Furthermore, failing to detect a malignancy can significantly delay
treatment, thereby reducing the chance of achieving a cure.

[0132] The selection of an "at-risk" population can also influence the value of sputum cytology
as a screening tool. Individuals who are at significant risk include those with a prior diagnosis of
lung cancer, long-term smokers or former smokers (>30 pack years) and individuals with
long-term exposure to asbestos or pulmonary carcinogens. People with a genetic predisposition or
familial history are also included in an "at-risk" population. Such individuals are likely to benefit
from testing. While the inclusion of individuals with lower risk may result in an increase in the
absolute number of cases detected, it would be hard to justify the substantial increase in
healthcare costs.

[0133] Other factors that contribute to the relatively poor performance of conventional sputum
cytology include the location of the lesion, tumor size, histologic type, and the quality of the
sample. Squamous cell carcinoma accounts for 31% of all primary pulmonary neoplasms. Most
of these tumors arise from segmental bronchi and extend to the proximal lobar and distal
subsegmental branches [15]. For this reason, sputum cytology is reasonably effective (79%) in
detecting these lesions. Currently, squamous cell carcinoma is viewed as the only type of lung
cancer that is amenable to cytologic detection in an in situ and radiologically occult stage [15], as
sloughed cells are more likely to be available for evaluation. In one large study where patients
were followed with both chest X-ray and sputum cytology, 23% of all lung cancers were detected
by cytology alone, suggesting that the tumors were early stage and radiologically occult [16]. In
another study [17], sputum cytology detected 76% of patients with radiologically occult tumors.
[0134] In the case of adenocarcinoma, 70% of tumors occur in the periphery of the lung,
making it less likely that malignant cells will be found in a conventional sputum specimen. For
this reason, adenocarcinomas are rarely detected by sputum cytology (45%) [12,18,19], an
important consideration, since the incidence of adenocarcinoma appears to be increasing,
particularly in women [20-22].

[0135] Tumor size can also affect the likelihood of achieving a correct diagnosis, a factor that
is particularly important when considering a screening test for the detection of disease in
asymptomatic individuals. While there is only a 50% chance that tumors <24 mm will be read as
a true positive, the probability of detecting a larger lesion is in excess of 84% [12].
[0136] Recent reports also indicate that the cellularity of the specimen will affect the sensitivity
of sputum cytology [14,23]. In general, patients with squamous cell carcinoma produce
specimens with significant numbers of tumor cells, thereby increasing the likelihood of a correct
diagnosis [14,23]. For patients with adenocarcinoma, the presence of tumor cells in a sputum
specimen is reported to be less than 10% in 95% of the specimens and less than 2% in 75% of
specimens, making the diagnosis significantly more difficult.

[0137] The degree of differentiation can also influence the ability of a pathologist to detect 
malignant cells, particularly in cases of adenocarcinoma. Well-differentiated tumor cells 
frequently resemble nonneoplastic respiratory epithelial cells. In the case of small-cell lung
carcinoma, sputum samples often contain nests of loosely aggregated cells that have a distinct 
appearance. However, techniques currently used to process sputum samples tend to disaggregate 
the cells, making a diagnosis more difficult. 

[0138] Sample quality is another factor that can contribute to the low sensitivity of sputum
cytology. Recent reports suggest that it is possible to obtain adequate samples from 70-85% of
subjects. However, achieving this measure of success often requires that patients provide
multiple specimens [13]. This procedure is inconvenient, time-consuming and costly. Patient
compliance is also generally low, as patients are frequently asked to collect over several days
[13]. Of equal importance is the observation that former smokers, while at significant risk for
developing lung cancer, often fail to produce an adequate specimen. Sample preservation and
processing is another critical factor that can affect the value of sputum cytology as a diagnostic
test.

[0139] Lastly, even if adequate samples could be obtained and optimally prepared, 
cytotechnologists generally still have to review 2-4 slides per specimen, each typically taking up 
to four minutes [24]. Given the low sensitivity, high technical complexity and labor intensity of
conventional sputum cytology, it is not surprising that this test has been almost universally 
rejected as a population-based screen for the early detection of lung cancer [25]. 
[0140] Even if these technical issues were resolved, the low sensitivity of sputum cytology
remains a significant problem. The high incidence of false negative results can significantly
delay the patient receiving potentially curative therapy. While it may be possible to develop tests
with greater sensitivity, such improvements must not come at the cost of specificity. An increase
in the number of false positive results would subject patients to unnecessary, often invasive and
costly, follow-up and would have a negative impact on the patient's quality of life. The present
invention overcomes many of the limitations associated with previous methods of early cancer
detection, including those related to the use of sputum cytology for the early detection of lung
cancer.

[0141] Lung cancer is a heterogeneous collection of diseases. To ensure that a test has the
necessary level of sensitivity and specificity to justify its use as a population-based screen, the
present invention envisions using, for example, a library of 10 to 30 cellular markers to develop
panels. Selection of the library of this invention was based on a review and reanalysis of the
relevant scientific literature where, in most cases, marker expression was measured in biopsy
specimens taken from patients with lung cancer in an attempt to link expression with prognosis.
[0142] For example, a preferred panel for early detection, characterization, and/or monitoring
of lung cancer in a patient's sputum may include molecular markers for which a change in
expression occurred in at least 75% of tumor specimens. An exemplary panel includes markers
selected from VEGF, Thrombomodulin, CD44v6, SP-A, Rb, E-Cadherin, cyclin A, nm23,
telomerase, Ki-67, cyclin D1, PCNA, MAGE-1, Mucin, SP-B, HERA, FGF-2, C-MET, thyroid
transcription factor, Bcl-2, N-Cadherin, EGFR, Glut-1, ER-related (p29), MAGE-3 and Glut-3. A
most preferred panel includes molecular markers for which a change in expression occurs in
more than 85% of tumor specimens. An exemplary panel includes molecular markers selected
from Glut1, HERA, Muc-1, Telomerase, VEGF, HGF, FGF, E-cadherin, Cyclin A, EGF
Receptor, Bcl-2, Cyclin D1 and N-cadherin. With the exception of Rb and E-cadherin, a
diagnosis of lung cancer is associated with an increase in marker expression. A brief description
of the library of probes/markers utilized in the present example is provided below in Table 4. It
is noted that the numbering of the antibodies in the table below is consistent with the number of
the antibodies/probes/markers throughout this example.
Table 4: Probes and Markers for Lung Panel

No. | Marker Abbreviation            | Full Name of Antibody Probe                    | Target Marker Name/Description
1   | VEGF                           | anti-VEGF                                      | Vascular Endothelial Growth Factor protein
2   | Thrombomodulin                 | anti-Thrombomodulin                            | transmembrane glycoprotein
3   | CD44v6                         | anti-CD44v6                                    | cell surface glycoprotein (CD44 variant 6 gene); cell adhesion molecule
4   | SP-A                           | anti-Surfactant Apoprotein A                   | pulmonary surfactant apoprotein
5   | Retinoblastoma                 | anti-Retinoblastoma gene product               | phosphoprotein
6   | E-Cadherin                     | anti-E-Cadherin                                | transmembrane Ca++ dependent cell adhesion molecule
7   | Cyclin A                       | anti-Cyclin A                                  | protein subunit of cyclin-dependent kinase enzymes; for cell cycle regulation
8   | nm23                           | anti-nm23                                      | 2 closely related proteins produced by nm23-H1 and -H2 genes
9   | Telomerase                     | anti-Telomerase                                | ribonucleoprotein enzyme for chromosome repair
10  | Mib-1 (Ki-67)                  | anti-Ki-67                                     | nuclear protein; expressed in proliferating cells
11  | Cyclin D1                      | anti-Cyclin D1                                 | protein subunit of cyclin-dependent kinase enzymes; for cell cycle regulation
12  | PCNA                           |                                                | protein cofactor for DNA polymerase delta
13  | MAGE-1                         | anti-Melanoma-Associated Antigen 1             | cell recognition protein coded by MAGE family of genes
14  | Mucin 1 (MUC-1)                | anti-Mucin 1                                   | cell surface and secreted mucin (highly glycosylated protein)
15  | SP-B                           | anti-mature Surfactant Apoprotein B            | pulmonary surfactant apoprotein
16  | HERA                           | anti-Human Epithelial Related Antigen (MOC-31) | cell surface antigen (transmembrane protein)
17  | FGF-2 (basic FGF)              | anti-Fibroblast Growth Factor                  | protein that binds to cell surface
18  | C-MET                          | anti-C-MET                                     | transmembrane receptor protein for Hepatocyte Growth Factor (HGF)
19  | Thyroid Transcription Factor 1 | anti-TTF-1                                     | regulator of thyroid-specific genes; also expressed in lung
20  | BCL-2                          | anti-BCL2                                      | intracellular membrane-bound protein encoded by BCL2 gene
21  | P120                           | anti-p120                                      | Proliferation-Associated Nucleolar Antigen protein
22  | N-Cadherin                     | anti-N-Cadherin                                | transmembrane Ca++ dependent cell adhesion molecule
23  | EGFR                           | anti-EGFR                                      | Epidermal Growth Factor Receptor; transmembrane glycoprotein
24  | Glut 1                         | anti-Glut 1                                    | Glucose-transporting, transmembrane Glut family of proteins
25  | ER-related (p29)               | anti-ER-related p29; anti-HSP 27               | Estrogen Receptor-related p29 protein; Heat Shock protein 27
26  | MAGE-3                         | anti-Melanoma-Associated Antigen 3             | cell recognition protein coded by MAGE family of genes
27  | Glut 3                         | anti-Glut 3                                    | Glucose-transporting, transmembrane Glut family of proteins
28  | PCNA (higher dilution)         | anti-Proliferating Cell Nuclear Antigen        | protein cofactor for DNA polymerase delta

[0143] Each molecular marker in the preferred panel is described below. Table 5, reciting the
percentage of expression of the markers in tissue for each type of lung cancer, is provided at the
end of this section.

Glucose Transporter Proteins (Glut 1 and Glut 3) [26-28] 
[0144] Glucose Transporter-1 (Glut 1) and Glucose Transporter-3 (Glut 3) are ubiquitously
expressed high-affinity glucose transporters. Tumor cells often display higher rates of respiration,
glucose uptake, and glucose metabolism than do normal cells, and the elevated uptake of glucose
in tumor cells is thought to be mediated by glucose transporters. Overexpression of certain types
of GLUT isoforms has been reported in lung cancer. The cellular localization of Glut 1 is in the
cell membrane. GLUT-1 and GLUT-3 are disease markers useful for detection of a disease state.
[0145] Malignant cells exhibit an increase in glucose uptake that appears to be mediated by a
family of glucose transporter proteins (Gluts). Oncogenes and growth factors appear to regulate
the expression of these proteins as well as their activities. Members of the Glut family of
proteins exhibit different patterns of distribution in various human tissues and rapid proliferation
is often associated with their overexpression. Recent evidence suggests that Glut1 is expressed
by a large percentage of NSCLC and by a majority of SCLC.

[0146] While the expression of Glut 3 is relatively low in both NSCLC and SCLC, a significant
percentage (39.5%) of large cell carcinomas express the protein. In stage I tumors, 83% express
Glut1 at some level with 75-100% of cells staining in 25% of cases. These data would suggest
that Glut1 overexpression is a relatively early event in tumor progression. Glut1
immunoreactivity has also been detected in >90% of stage II and IIIA cancers. There also
appears to be an inverse correlation between Glut1 and Glut3 immunoreactivity and tumor
differentiation. Tumors expressing high levels of Glut1 appear to be particularly aggressive and
are associated with a poor prognosis. In cases where tumors were negative for the proteins, better
survival was observed.

Human Epithelial Related Antigen (HERA) [29,30] 
[0147] HERA is a transmembrane glycoprotein with an as yet unknown function. HERA is
present on most normal and malignant epithelia. Recent reports suggest that HERA
expression is high in all histologic types of NSCLC, making it useful as a detection marker. In
contrast, HERA expression is absent in mesothelioma, suggesting that it would have utility as a
discrimination marker. The cellular localization of HERA is the cell surface.

Basic Fibroblast Growth Factor (FGF) [31-34]
[0148] Basic Fibroblast Growth Factor (FGF) is a polypeptide growth factor with a high
affinity for heparin and other glycosaminoglycans. In cancer, FGF functions as a potent mitogen,
plays a role in angiogenesis, differentiation, and proliferation, and is involved in tumor
progression and metastasis. FGF overexpression frequently occurs in both SCLC and squamous
cell carcinoma. In many cases (62%), the cells also express the FGF receptor, suggesting the
presence of an autocrine loop. Forty-eight percent of Stage I tumors overexpress FGF. The
frequency of FGF in Stage II lung cancer is 84%. Expression of either the growth factor or its
receptor was associated with a poor prognosis. Five-year survival rates for those patients with
stage I disease were 73% for those expressing FGF versus 80% for those who were FGF
negative. The cellular localization is the cell membrane.

Telomerase [35-42]
[0149] Telomerase is a ribonucleoprotein enzyme that extends and maintains telomeres of
eukaryotic chromosomes. It consists of a catalytic protein subunit with reverse transcriptase
activity and an RNA subunit that serves
as the template for telomere extension. Cells that do not express telomerase have successively
shortened telomeres with each cell division, which ultimately leads to chromosomal instability,
aging and cell death. The cellular localization of telomerase is nuclear.
[0150] Expression of telomerase appears to occur in immortalized cells and enzyme activity is
a common feature of the malignant phenotype. Approximately 80-94% of lung tumors exhibit
high levels of telomerase activity. In addition, 71% of hyperplasia, 80% of metaplasia, and 82%
of dysplasia express enzyme activity. All the carcinoma in situ (CIS) specimens exhibit enzyme
activity. The low levels of expression in premalignant tissues are probably related to the fact that
only a small percentage of cells (5 and 20%) in the sample express enzyme activity. This is in
contrast to tumors where 20-60% of cells may express enzyme activity. Based on a limited
number of samples, it would appear that expression of telomerase activity is also common in SCLC.

Proliferating Cell Nuclear Antigen (PCNA) [43-51]
[0151] PCNA functions as a cofactor for DNA polymerase delta. PCNA is expressed in both S
phase of the cell cycle and during periods of DNA synthesis associated with DNA repair. PCNA
is expressed in proliferating cells in a wide range of normal and malignant tissues. The cellular
localization of PCNA is nuclear.

[0152] Expression of PCNA is a common feature of rapidly dividing cells and is detected in
98% of tumors. Immunohistochemical staining is nuclear with moderate to intense staining
detected in 83% of NSCLC. Intense PCNA staining was observed in 51% of p53-negative
tumors. However, when both PCNA (>50% of cells staining) and p53 (>10% of cells stained)
are overexpressed, the prognosis tends to be poorer with a shorter time to progression. Although
frequently detected in all stages of lung cancer, intense staining for PCNA is more common in
metastatic disease. Thirty-one percent of CIS also overexpress PCNA.

CD44 [51-58]

[0153] CD44v6 is a cell surface glycoprotein that acts as a cellular adhesion molecule. It is
expressed on a wide range of normal and malignant cells in epithelial, mesothelial and
hematopoietic tissues. The expression of specific CD44 splice variants has been shown to be
associated with metastasis and poor prognosis in certain human malignancies. It is expected to
be used for detection and discrimination between squamous cell carcinoma and adenocarcinoma.
CD44 is a cell adhesion molecule that appears to play a role in tumor invasion and metastasis.
Alternative splicing results in the expression of several variant isoforms. CD44 expression is
generally lacking in SCLC and is variably expressed in NSCLC. The highest levels of expression
occur in squamous cell carcinoma, thus making it valuable in discriminating between tumor
types. In non-neoplastic tissue, CD44 staining is observed in bronchial epithelial cells,
macrophages, lymphocytes, and alveolar pneumocytes. There was no significant correlation
between CD44 expression and tumor stage, recurrence, or survival, particularly when
overexpression occurs in early stage disease. In metastatic lesions 100% of squamous cell
carcinoma and 75% of adenocarcinoma showed strong CD44v6 positivity. These data would
tend to indicate that changes in CD44 expression occur relatively late in tumor progression, which
could limit its value as an early detection marker. Recent findings suggest that the CD44v8-10
variant is expressed by a majority of NSCLC, making it a possible candidate marker.
Cyclin A [59-62] 

[0154] Cyclin A is a regulatory subunit of the cyclin-dependent kinases (CDKs), which control
the transition points at specific phases of the cell cycle. It is detectable in S phase and during
progression into G2 phase. The cellular localization of Cyclin A is nuclear.
[0155] Protein complexes consisting of cyclins and cyclin-dependent kinases function to
regulate cell cycle progression. Changes in cyclin expression are associated with genetic
alterations affecting the CCND1 gene. While the cyclins act as regulatory molecules, the
cyclin-dependent kinases function as catalytic subunits activating and inactivating Rb.
[0156] Immunohistochemical analysis has revealed that the overexpression of the cyclins is
associated with an increase in cellular proliferation as indicated by a high Ki-67 labeling index.
Cyclin overexpression occurs in 75% of NSCLC and appears to occur relatively early in tumor
progression. Recent reports indicate that 66.7% of stage I/II and 70.9% of stage III tumors
overexpress Cyclin A. Nuclear staining is common in poorly differentiated tumors. Expression
of cyclin A is often associated with a decrease in mean survival time and a tendency towards the
development of drug resistance. However, increased expression has also been associated with a
greater response to doxorubicin.

Cyclin D1 [63-73]

[0157] Cyclin D1, as with Cyclin A, is a regulatory subunit of the cyclin-dependent kinases
(CDKs), which control the transition points at specific phases of the cell cycle. Cyclin D1
regulates the entry of cells into S phase of the cell cycle. This gene is frequently amplified
and/or its expression deregulated in a wide range of human malignancies. The cellular
localization of Cyclin D1 is nuclear.

[0158] Like Cyclin A, cyclin D1 functions to regulate cell cycle progression. Staining of cyclin
D1 is predominately cytoplasmic and independent of histologic type. Reports suggest that cyclin
D1 overexpression occurs in 40-70% of NSCLC and 80% of SCLC. Cyclin D1 staining was
observed in 37.9% of stage I, 60% of stage II, and 57.9% of stage III tumors. Cyclin D1 expression
has also been seen in dysplastic and hyperplastic tissue, providing evidence that these changes
occur relatively early in tumor progression. Patients who overexpress cyclin D1 exhibit shorter
mean survival time and lower five-year survival rate.

Hepatocyte Growth Factor Receptor (C-MET) [74-77] 
[0159] C-MET is a proto-oncogene that encodes a transmembrane receptor tyrosine kinase for 
HGF. HGF is a mitogen for hepatocytes and endothelial cells, and exerts pleiotropic activity on
several cell types of epithelial origin. The cellular localization of C-MET is the cell surface.
[0160] Hepatocyte growth factor/scatter factor (HGF/SF) stimulates a broad spectrum of
epithelial cells causing them to proliferate, migrate, and carry out complex differentiation
programs including angiogenesis. HGF/SF binds to a receptor encoded by the c-MET oncogene.
While both normal and malignant tissues express the HGF receptor, expression of HGF/SF
appears to be limited to malignant tissue.

[0161] While the human lung generally expresses low levels of HGF/SF, expression increases 
markedly in NSCLC. Using Western blot analysis, 88.5% of lung cancers exhibited an increase 
in the protein expression. All histologic types of tumors expressed the protein at increased 
concentrations. While increased levels of protein occur in all stages of the disease, recent 
evidence suggests that in addition to the cancer cells, stromal cells and/or inflammatory cells may 
be responsible for the production of the growth factor.

Mucin - MUC-1 [78-82]
[0162] Mucin-1 comes from a family of highly glycosylated secretory proteins which comprise
the major protein constituents of the mucous gel which coats and protects the tracheobronchial
tree, gastrointestinal tract and genitourinary tract. Mucin-1 is atypically expressed in epithelial
tumors. The cellular localization of Mucin-1 is cytoplasm and the cell surface.
[0163] Mucins are a family of high molecular weight glycoproteins, either membrane bound or
secreted, that are synthesized by a variety of secretory epithelial cells. Within the
respiratory tract, these proteins contribute to the mucus gel that coats and protects the
tracheobronchial tree. Changes in mucin expression commonly occur in conjunction with
malignant transformation, including lung cancer. Evidence exists suggesting that these changes may
contribute to alterations in cell growth regulation, recognition by the immune system, and the
metastatic potential of the tumor.

[0164] Although normal lung tissue expresses MUC-1, significantly higher levels of expression
are found in lung cancer, with the highest levels occurring in adenocarcinoma. Staining appears to
occur independently of stage and is more common in smokers than in former smokers or
nonsmokers. Some premalignant lesions also exhibit increased MUC-1 expression.

Thyroid Transcription Factor-1 (TTF-1) [83,84] 
[0165] TTF-1 belongs to a family of homeodomain transcription factors that activate thyroid-
specific and pulmonary-specific differentiation genes. The cellular localization of TTF-1 is
nuclear.

[0166] TTF-1 is a protein originally found to mediate the transcription of thyroglobulin.
Recently, TTF-1 expression was also found in the diencephalon and bronchioloalveolar
epithelium. Within the lung, TTF-1 functions as a transcription factor regulating the synthesis of
surfactant proteins and Clara secretory protein. Overexpression of TTF-1 occurs in a large
proportion of lung adenocarcinomas and can aid in distinguishing between primary lung cancer
and cancers that metastasize to the lung. Adenocarcinomas that express TTF-1 and are
cytokeratin 7 positive and cytokeratin 20 negative can be detected with 95% sensitivity.

Vascular Endothelial Growth Factor (VEGF) [33,61,85-89]
[0167] VEGF plays an important role in angiogenesis, which promotes tumor progression and
metastasis. There are multiple forms of VEGF; the two smaller isoforms are secreted proteins
and act as diffusible agents, whereas the larger two remain cell associated. The cellular
localization of VEGF is cytoplasmic, cell surface, and extracellular matrix.
[0168] Vascular Endothelial Growth Factor (VEGF) is an important angiogenesis factor and
endothelial cell-specific mitogen. Angiogenesis is an important process in the latter stages of
carcinogenesis and tumor progression, and is particularly important in the development of distant
metastasis. VEGF binds to a specific receptor Fk that is often present in the tumors expressing
the growth factor, suggesting the presence of an autocrine loop.

[0169] Immunohistochemical analysis reveals that cells expressing VEGF exhibit a pattern of
staining that is diffuse and cytoplasmic. While not expressed by nonneoplastic cells, VEGF is
present in the majority of NSCLC and in a smaller percentage of SCLC. Several reports have
shown high levels of VEGF in early stage lung cancer.

[0170] Expression of VEGF has been associated with an increased frequency of metastasis.
Studies have shown that VEGF expression is indicative of a poor prognosis and shorter disease-
free interval in adenocarcinoma but not in squamous cell carcinoma. Three-year and five-year
survival rates in the group expressing high levels of VEGF were 50% and 16.7%, as compared to
90.9% and 77.9%, respectively, for the low VEGF group.

Epidermal Growth Factor Receptor (EGFR) [90-104]
[0171] Epidermal Growth Factor Receptor (EGFR) is a transmembrane glycoprotein, which
can bind and be activated by various ligands. Binding initiates a chain of events that results
in DNA synthesis, cell proliferation, and cell differentiation. EGFR has been demonstrated in a
broad spectrum of normal tissues, and EGFR overexpression is found in a variety of neoplasms.
Increased expression has been observed in adenocarcinomas of the lung and large cell
carcinomas but not in small cell lung carcinomas. The cellular localization of EGFR is the cell
surface.

[0172] The EGFR plays an important role in cell growth and differentiation. The EGFR is
uniformly present in the basal cell layer but not in the more superficial layers of histologically
normal bronchial epithelium. With this exception, there is no consistent staining of normal
tissue. Recent evidence suggests that the overexpression of the EGF receptor may not be an
absolute requirement for the development of invasive lung cancer. However, it appears that in
cases where EGFR overexpression occurs it is a relatively early event, with greater staining
intensity in more advanced disease.

[0173] For patients with invasive carcinomas, 50-77% of tumors stain for EGF.
Overexpression of the EGFR is more common in squamous cell carcinoma than in
adenocarcinoma and common in SCLC. Highest levels of EGFR occur in conjunction with late
stage and metastatic disease, which have approximately twice the concentration of EGFR
seen in stage I/II tumors. Estimates suggest that the level of the EGFR observed in stage I
tumors is approximately twice that seen in normal tissue. In addition, 48% of bronchial lesions
also show EGFR staining, including metaplasia, atypia, dysplasia, and CIS. In the "normal"
bronchial mucosa of these same cancer patients, overexpression of the EGFR was observed in
39% of cases but was absent in the bronchial epithelium of the non-cancer patients. In addition,
overexpression of the EGFR occurs more frequently in the tumors of smokers than in
nonsmokers, particularly in the case of squamous cell carcinoma.

[0174] While several studies have suggested that overexpression of the EGFR is associated
with a poor prognosis, other studies have failed to make this correlation.

Nucleoside Diphosphate Kinase/nm23 [105-111] 
[0175] Nucleoside diphosphate kinase (NDP kinase)/nm23 is a nucleoside diphosphate kinase.
Tumor cells with high metastatic potential often lack or express only a low amount of nm23
protein; hence the nm23 protein has been described as a metastasis suppressor protein. The
cellular localization of nm23 is nuclear and cytoplasmic.

[0176] Expression of nm23/nucleoside diphosphate kinase A (nm23) is a marker of tumor
progression where there is an inverse relationship between expression and metastatic potential.
In cases where stage I tumors overexpress nm23, no evidence of metastasis was seen during an
average follow-up period of 35 months. Immunohistochemical analysis reveals staining that is
diffuse, cytoplasmic and generally limited to malignant cells. Alveolar macrophages also express
the protein. Given that high levels of expression are associated with a low metastatic potential,
there is currently no explanation as to why normal epithelial cells do not express nm23.
[0177] Intense staining has been observed in a high percentage of NSCLC, particularly large cell
lung cancer, and in 74% of SCLC, suggesting that this protein plays an important role in tumor
progression. With the exception of squamous cell carcinoma, staining intensity tends to increase
with stage. Based on the available evidence, it would appear that nm23 is a prognostic factor in
both SCLC and NSCLC.

Bcl-2 [101,112-125] 

[0178] Bcl-2 is a mitochondrial membrane protein that plays a central role in the inhibition of
apoptosis. Overexpression of bcl-2 is a common feature of cells in which programmed cell death
has been arrested. The cellular localization of Bcl-2 is the cell surface.
[0179] Bcl-2 is a protooncogene believed to play a role in promoting the terminal
differentiation of cells, prolonging the survival of non-cycling cells and blocking apoptosis in
cycling cells. Bcl-2 can exist as a homodimer or can form a heterodimer with Bax. As a
homodimer, Bax functions to induce apoptosis. However, the formation of a Bax-bcl-2 complex
blocks apoptosis. By blocking apoptosis, bcl-2 expression appears to confer a survival advantage
upon affected cells. Bcl-2 expression may also play a role in the development of drug resistance.
The expression of bcl-2 is negatively regulated by p53.

[0180] Immunohistochemical analysis of bcl-2 reveals a heterogeneous pattern of cytoplasmic
staining. In adenocarcinoma, expression of bcl-2 was significantly associated with smaller
tumors (<2 cm) and lower proliferative activity. The expression of bcl-2 appears to be more
closely associated with neuroendocrine differentiation and occurs in a large percentage of SCLC.
[0181] Overexpression of bcl-2 is not present in preneoplastic lesions, suggesting that changes
in bcl-2 occur relatively late in tumor progression. In addition to tumor cells, bcl-2
immunostaining also occurs in basal cells and on the luminal surfaces of normal bronchioles but
is generally not detected in more differentiated cell types.

[0182] Association of bcl-2 immunoreactivity with improved prognosis in NSCLC is
controversial. Several reports have suggested that patients with tumors expressing bcl-2 have a
superior prognosis and a longer time to recurrence. Several reports indicate that bcl-2 expression
tends to be lower in those patients who develop metastatic disease. For patients with squamous
cell carcinoma, expression of bcl-2 has been linked to an improvement in 5-year survival.
However, in three relatively large studies there was no survival benefit linked to bcl-2
expression, particularly for patients with early stage disease.

Estrogen Receptor-related Protein (p29) [126]
[0183] ER-related protein p29 is an estrogen-related heat shock protein that has been found to
correlate with the expression of estrogen receptor. The cellular localization of p29 is
cytoplasmic.

[0184] Estrogen-dependent intracellular processes are important in the growth regulation of
normal tissue and may play a role in the regulation of malignancies. In one study, expression of
p29 was detected in 109 (98%) of 111 lung cancers. The relation between p29 expression and
survival time was different for men and women. Expression of p29 was associated with poorer
survival, particularly in women with Stage I and II disease. There was no correlation between
p29 expression and long-term survival in men.

Retinoblastoma Gene Product (Rb) [68,73,123,127-141] 
[0185] Retinoblastoma Gene Product (Rb) is a nuclear DNA-binding phosphoprotein.
Underphosphorylated Rb binds oncoproteins of DNA tumor viruses and gene regulatory proteins, thus
inhibiting DNA replication. Rb protein may act by regulating transcription; loss of Rb function
leads to uncontrolled cell growth. The cellular localization of Rb is nuclear.
[0186] Retinoblastoma protein (pRb) is a protein that is encoded by the retinoblastoma gene
and is phosphorylated and dephosphorylated in a cell cycle dependent manner. pRb is
considered an important tumor suppressor that functions to regulate the cell cycle at G0/G1.
In its hypophosphorylated state, pRb inhibits the transition from G1 to S. During G1,
inactivation of the growth suppressive properties of pRb occurs when the cyclin dependent
kinases (CDK's) phosphorylate the protein. The hyperphosphorylation of pRb prevents it from
forming a complex with E2F, which functions as a transcription factor for proteins that are
required for DNA synthesis.

[0187] Inactivation of the retinoblastoma (Rb) gene has been documented in various types of
cancer, including lung cancer. Small-cell carcinomas fail to stain for pRb, indicating loss of Rb
function. Overall, 17.6% of the tumors fail to express pRb, with no correlation being seen with
respect to stage or nodal status. A reduction in staining has also been seen in 31% of dysplastic
bronchial biopsies. However, there appears to be no correlation between pRb expression and the
severity of dysplasia. In contrast, normal bronchial epithelium and cells taken from areas adjacent to
tumors expressed pRb positive nuclei. These data suggest that alterations in the expression of the
Rb protein may arise early in the development of some lung cancers.
[0188] Patients with Rb-positive carcinomas tend to have a somewhat better prognosis but, in
most studies, the difference is not significant. However, patients with adenocarcinoma whose
tumors are both pRb negative and either p53 or ras positive exhibit a decrease in 5-year survival.
A similar relationship does not occur in squamous cell carcinoma. pRb negative tumors have
been reported to be more likely to exhibit resistance to doxorubicin than Rb-positive carcinomas.

Thrombomodulin [142-147] 
[0189] Thrombomodulin is a transmembrane glycoprotein. Through its accelerated activation
of protein C (which in turn acts as an anticoagulant by binding protein S and thrombin), synthesis
of TM is one of several mechanisms important in reducing clot formation on the surface of
endothelial cells. The cellular localization of thrombomodulin is the cell surface.
[0190] Aggregation of host platelets by circulating tumor cells appears to play an important
role in the metastatic process. Thrombomodulin plays an important role in the activation of the
anticoagulant protein C by thrombin and is an important modulator of intravascular coagulation.
In addition to its expression in normal squamous epithelium, expression of thrombomodulin also
occurs in squamous metaplasia, carcinoma in situ, and invasive squamous cell carcinomas.
Although present in 74% of primary squamous cell carcinomas, only 44% of metastatic lesions
stained for thrombomodulin. These data suggest that, with progression, there is a decrease in
thrombomodulin expression. Higher levels of expression tend to occur in well and moderately
differentiated tumors when compared to poorly differentiated tumors.

[0191] Patients with thrombomodulin-negative squamous cell carcinoma tend to have a worse
prognosis. Eighteen percent of patients with thrombomodulin-negative tumors have a five-year
survival, as compared to 60% in cases where the tumors stained positive for the protein. Progression to
metastatic disease was also more common in thrombomodulin-negative tumors (69% vs. 37%)
and there was a greater tendency for these tumors to develop at extrathoracic sites. Thus, loss of
thrombomodulin expression appears to be prognostic in cases of squamous cell carcinoma. The
observation that changes in thrombomodulin expression occur in later stages of NSCLC and that
the protein is expressed by normal bronchial epithelial cells would tend to limit its utility as a
marker for early detection. However, since a majority of mesotheliomas and only a small
percentage of adenocarcinomas express thrombomodulin, the marker has potential utility in
discriminating between these two tumor types.

E-cadherin & N-cadherin [148-151]
[0192] E-cadherin is a transmembrane Ca2+-dependent cell adhesion molecule. It plays an
important role in the growth and development of cells via the mechanisms of control of tissue
architecture and the maintenance of tissue integrity. E-cadherin contributes to intercellular
adhesion of epithelial cells, the establishment of epithelial polarization, glandular differentiation,
and stratification. Down-regulation of E-cadherin expression has been observed in a number of
carcinomas and is usually associated with advanced stage and progression. The cellular
localization of E-cadherin is the cell surface.

[0193] E-cadherin is a calcium-dependent epithelial cell adhesion molecule. A decrease in
E-cadherin expression has been associated with tumor dedifferentiation, metastasis, and
decreased survival. Reduced expression has been observed in moderately and poorly
differentiated squamous cell carcinoma and in SCLC. There was no change in E-cadherin
expression in adenocarcinoma. Furthermore, while adenocarcinomas express E-cadherin, these
tumors fail to express N-cadherin, which is in contrast to mesotheliomas that express N-cadherin
but not E-cadherin. Thus, these markers can be used to discriminate between adenocarcinoma
and mesothelioma.

[0194] Expression of E-cadherin can also be used to assess the prognosis of patients with
squamous cell carcinoma. Whereas 60% of patients with tumors expressing E-cadherin survived
three years, only 36% of patients exhibiting a reduction in expression survived three years.

MAGE-1 and MAGE-3 [152-156] 
[0195] Melanoma Antigen-1 (MAGE-1) and Melanoma Antigen-3 (MAGE-3) are members of
a family of genes that are normally silent in normal tissues but when expressed in malignant
neoplasms are recognized by autologous, tumor-directed and specific cytotoxic T cells (CTL's).
The cellular localization of MAGE-1 and MAGE-3 is cytoplasmic.

[0196] MAGE-1, MAGE-3 and MAGE-4 gene products are tumor-associated antigens that are
recognized by cytotoxic T lymphocytes. As such, they could have utility as targets for
immunotherapy in NSCLC. MAGE proteins are also expressed by some SCLCs but not by
normal cells. While the frequency of MAGE expression falls below the level necessary for use
as a detection marker, differences in the pattern of expression between histologic types suggest
that MAGE proteins may have utility as differentiation markers. This utility is also supported
by the observation that, in 50% of squamous cell carcinoma, greater than 90% of tumor cells
showed evidence of MAGE-3 overexpression, with 30% of tumors exhibiting overexpression in at
least 50% of cells.

Nucleolar Protein (p120) [157]
[0197] p120 (proliferation-associated nucleolar antigen) is found in the nucleoli of
rapidly proliferating cells during early G1 phase. The cellular localization of p120 is nuclear.
[0198] Nucleolar protein p120 is a proliferation-associated protein whose function has yet to be
elucidated. Strong staining has been detected in tumor tissue but not in macrophages or normal
tissue. Overexpression of p120 was more common in squamous cell carcinoma than in
adenocarcinoma or large cell carcinoma, raising the possibility that this marker may have utility
in discriminating between tumor types.

Pulmonary Surfactants [83,158-166] 
[0199] Pulmonary surfactants are a phospholipid-rich mixture that functions to reduce the
surface tension at the alveolar-liquid interface, thus providing the alveolar stability necessary for
ventilation. Surfactant proteins appear to be expressed exclusively in the airway and are
produced by alveolar type II cells. In the non-neoplastic lung, pro-surfactant-B immunoreactivity
is detected in normal and hyperplastic alveolar type II cells and some non-ciliated bronchiolar
epithelial cells. Sixty percent of adenocarcinomas contained strong cytoplasmic
immunoreactivity, with 10-50% of tumor cells exhibiting staining in the majority of cases.
Squamous cell carcinoma and large cell carcinoma failed to stain for pro-surfactant-B.
[0200] Surfactant Apoprotein B (SP-B) is one of four hydrophobic proteins that make up the
pulmonary surfactant, which is a phospholipid and protein complex secreted by type II alveolar
cells. Squamous cell and large cell carcinomas of the lung and nonpulmonary adenocarcinomas
do not express SP-B. The cellular localization of SP-B is cytoplasmic.
[0201] SP-A is a pulmonary surfactant protein that plays an essential role in keeping alveoli
from collapsing at the end of expiration. SP-A is a unique differentiation marker of pulmonary
alveolar epithelial cells (type II pneumocytes); the antigen is preserved even in the neoplastic
state. The cellular localization of SP-A is cytoplasmic.

[0202] Pulmonary surfactant A appears to be specific for non-mucinous bronchiolo-alveolar
carcinoma, with 100% staining as compared to none of the mucinous type. Pulmonary
surfactants potentially have utility in discriminating lung cancer from other cancers metastasized
to lung. In addition to tumor cells, non-neoplastic pneumocytes also stain for pulmonary
surfactant A. As with pulmonary surfactant B, staining for pulmonary surfactant A is relatively
common in adenocarcinoma but not in other forms of NSCLC or in SCLC. Mesothelioma also
fails to express pulmonary surfactant A, leading to the suggestion that pulmonary surfactant A
may have utility in the discrimination between adenocarcinoma and mesothelioma.

Ki-67

[0203] Ki-67 is a nuclear protein that is expressed in proliferating normal and neoplastic cells
and is down-regulated in quiescent cells. It is present in G1, S, G2, and M phases of the cell
cycle, but is absent in G0 phase. It is commonly used as a marker of proliferation. The cellular
localization of Ki-67 is nuclear.






Table 5:

Marker                                   Squamous Cell   Adenocarcinoma  Large Cell   Small Cell   Mesothelioma
                                         Carcinoma                       Carcinoma    Carcinoma
Glut1                                    100.0(a)        64.5            80.5         64.0         NDA*
Glut3                                    17.5            16.0            39.5         9.0          NDA
HERA                                     100.0           100.0           100.0        NDA          4.5
Basic FGF                                83.0            48.7            50.0         100.0        NDA
Telomerase                               82.3            86.3            93.0         66.7         NDA
PCNA                                     80.0            69.8            87.7         51.0         NDA
CD44v6                                   79.3            34.8            44.2         0.0          NDA
Cyclin A                                 79.0            68.0            83.5         97.0         NDA
Cyclin D1                                42.7            36.0            62.0         90.0         NDA
Hepatocyte Growth Factor/Scatter Factor  75.5            78.3            100.0        NDA          100.0
MUC-1                                    55.5            90.0            100.0        100          NDA
TTF-1                                    38.0            76.0            NDA          83.0         NDA
VEGF                                     61.8            68.3            100.0        43.5         NDA
EGF Receptor                             63.1            45.3            96.0         Frequently   NDA
nm23                                     68.0            52.6            83.5         73.5         NDA
Bcl-2                                    45.5            43.3            42.5         92.0         NDA
Loss of pRb Expression                   20.1            25.8            35.4         85.3         NDA
Thrombomodulin                           66.8            12.2            4.0          0.0          81.0
E-cadherin                               69.0            85.0            NDA          100.0        0.0
N-cadherin                               NDA             4.0             NDA          NDA          94.0
MAGE 1                                   45.0            35.0            NDA          16.5         NDA
MAGE 3                                   72.0            33.3            NDA          33.5         NDA
MAGE 4                                   45.5            11.0            NDA          50.0         NDA
Nucleolar Protein (p120)                 68.0            35.0            30.0         NDA          NDA
Pulmonary Surfactant B                   0.0             61.5            0.0          NDA          NDA
Pulmonary Surfactant A                   12.0            52.9            17.5         20           0.0

(a) Percent of tumors exhibiting a change in marker expression

* No Data Available
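
The discriminating use of Table 5 suggested in the marker descriptions (for example, E-cadherin and N-cadherin for adenocarcinoma versus mesothelioma) can be illustrated with a small sketch. It copies only the Table 5 rows that report a value for both tumor types and ranks them by the absolute difference in reported frequency. The names `TABLE5_ADENO_VS_MESO` and `rank_by_separation` are hypothetical, and a raw difference in published frequencies is only a crude illustration of separation, not the panel selection procedure of this application.

```python
# Percent of tumors showing a change in marker expression, copied from Table 5.
# Only markers with data for BOTH adenocarcinoma and mesothelioma are listed.
# Tuple order: (adenocarcinoma, mesothelioma).
TABLE5_ADENO_VS_MESO = {
    "HERA": (100.0, 4.5),
    "Hepatocyte Growth Factor/Scatter Factor": (78.3, 100.0),
    "Thrombomodulin": (12.2, 81.0),
    "E-cadherin": (85.0, 0.0),
    "N-cadherin": (4.0, 94.0),
    "Pulmonary Surfactant A": (52.9, 0.0),
}


def rank_by_separation(table):
    """Return (marker, absolute difference) pairs, largest difference first."""
    diffs = [
        (name, round(abs(adeno - meso), 1))
        for name, (adeno, meso) in table.items()
    ]
    return sorted(diffs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for marker, gap in rank_by_separation(TABLE5_ADENO_VS_MESO):
        print(f"{marker}: {gap} percentage points")
```

On these figures HERA, N-cadherin and E-cadherin show the widest gaps, which agrees with the text's suggestion that the cadherins can help separate the two tumor types.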

a. Obtaining a Library of Markers of a Suitable Size

[0204] Preliminary pruning steps were required in order to obtain a suitable size library of
markers that were correlated with lung cancer. More than a hundred markers correlated to lung
cancer are known in the literature. A partial listing of candidate probes identified in the literature
and evaluated for potential inclusion in panel tests includes antibodies to: bax, Bcl-2, c-MET
(HGFr), CD44S, CD44v4, CD44v5, CD44v6, cdk2 kinase, CEA (carcino-embryonic antigen),
Cyclin A, Cyclin D1, E-cadherin, EGFR, ER-related (p29), erbB-1, erbB-2, FGF-2 (bFGF), FOS,
Glut-1, Glut-2, Glut-3, Glut-4, Glut-5, HERA (MOC-31), HPV-16, HPV-18, HPV-31, HPV-33,
HPV-51, integrin VLA2, integrin VLA3, integrin VLA6, JUN, keratin, keratin 7, keratin 8,
keratin 10, keratin 13, keratin 14, keratin 16, keratin 17, keratin 18, keratin 19, A-type lamins (A;
C), B-type lamins (B1; B2), MAGE-1, MAGE-3, MAGE-4, melanoma-associated antigen clone
NKI/C3, mdm2, mib-1 (Ki-67), mucin 1 (MUC-1), mucin 2 (MUC-2), mucin 3 (MUC-3), mucin
4 (MUC-4), MYC, N-cadherin, NCAM (neural cell adhesion molecule), nm23, p16, p21, p27,
p53, p120, P-cadherin, PCNA, Retinoblastoma, SP-A, SP-B, Telomerase, Thrombomodulin,
Thyroid Transcription Factor 1, VEGF, vimentin, and waf1. The initial list of markers was
pruned by first assessing, from the literature, the apparent effectiveness of the probes in
detecting early stage cancer cells, discriminating between cells of differing cancer states, and
localizing the label to the target cancer cells. This list of markers was further pruned by
removing markers whose utilization would be difficult to reduce to practice because they are
difficult to produce or obtain, have unsuitable detection technology requirements, or show poor
reproducibility of reported results. After all of the pruning steps were complete, a library of 27
markers was obtained.

b. Optimizing Protocols and Obtaining Gold Standard Lung Cancer Samples

[0205] Preliminary preparation steps were also required prior to obtaining the panels. The
probes containing appropriate labels were available from commercial vendors. The protocols of
the probes were analyzed for optimum objective quantitative detection. For example, it was
determined that the concentration of PCNA was too low. Originally, PCNA was diluted 1:4000
in S809 buffer. A second dilution was made, which was 1:3200 in S809. The optimized
protocols for each marker are shown below. It is noted that the second column is labeled
"Antibody Name". Except for MOC-31, the probes in this list are listed by the marker name
because many of the vendors refer to the antibody by the name of the marker. It is noted that an
alternative way these reagents might be listed is, for example, anti-VEGF, anti-Thrombomodulin,
anti-CD44v6, etc.
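
The PCNA adjustment described above can be checked with simple arithmetic: moving from a 1:4000 to a 1:3200 dilution raises the antibody concentration by a factor of 4000/3200. The following is a minimal sketch of that calculation; the function name `relative_concentration` is hypothetical and is not taken from any vendor protocol.

```python
from fractions import Fraction


def relative_concentration(old_dilution, new_dilution):
    """Concentration at 1:new_dilution relative to 1:old_dilution.

    A dilution of 1:N is passed as the integer N, so a smaller N means a
    more concentrated working solution.
    """
    return Fraction(old_dilution, new_dilution)


if __name__ == "__main__":
    ratio = relative_concentration(4000, 3200)
    print(ratio, float(ratio))  # 5/4 1.25
```

Under this reading, the second PCNA dilution is 25% more concentrated than the original one.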

[0206] Gold standard tissue specimens were obtained from UCLA. Tissue specimens were
received from two sources. Cases had been diagnosed using standard procedures including
review of hematoxylin and eosin (H&E)-stained slides and the clinical history. Specimen slides
were coded and labeled with arbitrary numbers to blind the study pathologists to the historical
diagnosis and antibody marker and to protect patient confidentiality.

[0207] Specimen slides with tissue sections from cancerous and non-cancerous (control) tissues
were used. A total of 175 separate cases were analyzed. Within this set, the diagnoses
listed in Table 6 were present with the following frequencies:

Table 6:

Group     Diagnosis                   Number of occurrences
Cancer    Adenocarcinoma              25
Cancer    Large Cell Carcinoma        18
Cancer    Mesothelioma                26
Cancer    Small Cell Lung Cancer      20
Cancer    Squamous Cell Carcinoma     24
Control   Emphysema                   34
Control   Granulomatous Disease       3
Control   Interstitial Lung Disease   25
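
As a consistency check on Table 6, the per-diagnosis counts can be summed and compared with the stated total of 175 cases. The sketch below is illustrative only; the split into "cancer" and "control" groups is an assumption based on the text's description of cancerous and non-cancerous (control) tissues, and the names `CASES` and `group_totals` are hypothetical.

```python
# Case counts copied from Table 6. The cancer/control grouping is an
# assumption drawn from the description of the specimen set.
CASES = {
    "cancer": {
        "Adenocarcinoma": 25,
        "Large Cell Carcinoma": 18,
        "Mesothelioma": 26,
        "Small Cell Lung Cancer": 20,
        "Squamous Cell Carcinoma": 24,
    },
    "control": {
        "Emphysema": 34,
        "Granulomatous Disease": 3,
        "Interstitial Lung Disease": 25,
    },
}


def group_totals(cases):
    """Sum the case counts within each group."""
    return {group: sum(counts.values()) for group, counts in cases.items()}


if __name__ == "__main__":
    totals = group_totals(CASES)
    print(totals)                # {'cancer': 113, 'control': 62}
    print(sum(totals.values()))  # 175
```

The counts add up to the 175 cases reported in the text.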



c. Determination of the Level of Expression of the Panel of Molecular Markers
[0208] Sufficient specimen slides were prepared for each case so that only one probe was tested
per slide. In general, a microscope slide is prepared which contains the cytologic sample
contacted with one or more labeled probes that are directed at particular molecular markers.
Independently, each study pathologist examined an H&E-stained slide to make a diagnosis for
each case, and then examined each probe-reacted and immunochemically-stained slide to assess
the level of probe binding, recording the results on a standardized data form.
[0209] In greater detail, the immunohistochemical staining was performed on formalin fixed,
paraffin embedded (FFPE) tissue. Tissue sections were cut at 4 microns thick on poly-L-lysine
coated slides and dried at room temperature overnight. De-paraffinization and rehydration of the
tissue sections were performed as follows: To completely remove all of the embedding medium
from the specimen, the slides were incubated in two consecutive xylene-substitute (Histoclear)
baths for five minutes each. All liquid was tapped off the slides before incubation in two
consecutive baths of 100% reagent grade alcohol for three minutes each. Once again all excess
liquid was tapped off the slides before being incubated in two final baths of 95% reagent grade
alcohol for three minutes each. After the last bath of 95% alcohol the slides were rinsed in tap water and
held in wash buffer (Tris-buffered saline wash buffer containing 0.05% Tween 20 corresponding
to a 1:10 dilution of DAKO Autostainer Wash buffer, code S3306). Table 7, below, presents a
complete list of the reagents used in this study along with corresponding product code numbers.
Detection systems used in the study were DAKO EnVision+ HRP mouse (code K4007) or rabbit
(code K4003) and LSAB+ HRP (code K0690). The protocols for immunoassaying were
followed according to the package inserts. The kits contained liquid two component DAB+
substrate chromogen (code K3468).
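
The de-paraffinization and rehydration sequence described above can be written down as a short ordered schedule, which makes the total bath time easy to verify. This is a minimal sketch; the class `BathStep` and the helper `total_bath_minutes` are hypothetical names, and the tap water rinse and wash buffer hold are omitted because the text gives no duration for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BathStep:
    reagent: str
    baths: int             # number of consecutive baths
    minutes_per_bath: int  # incubation time per bath


# Bath sequence as described in the text, in order.
DEPARAFFINIZATION = (
    BathStep("Histoclear (xylene substitute)", 2, 5),
    BathStep("100% reagent grade alcohol", 2, 3),
    BathStep("95% reagent grade alcohol", 2, 3),
)


def total_bath_minutes(steps):
    """Total incubation time across all baths, in minutes."""
    return sum(step.baths * step.minutes_per_bath for step in steps)


if __name__ == "__main__":
    for step in DEPARAFFINIZATION:
        print(f"{step.baths} x {step.minutes_per_bath} min in {step.reagent}")
    print(total_bath_minutes(DEPARAFFINIZATION), "minutes in total")
```

With the times given in the text, the six baths take 22 minutes, not counting the time spent tapping off liquid between baths.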
Table 7:

Reagents used in the Study

Reagents                                     Code #
National Diagnostics HistoClear              HS-200
Mallinckrodt Reagent Alcohol Absolute        7019-10
DAKO Antibody Diluent                        S809
DAKO Background Reducing Antibody Diluent    S3022
DAKO Autostainer Buffer 10X                  S3306
DAKO Target Retrieval Solution               S1700
DAKO Hi pH Target Retrieval Solution         S3307
DAKO Proteinase K                            S3020
Rite Aid Hydrogen Peroxide 3%                None
DAKO Protein Block Serum Free                X0909
DAKO Goat Serum                              X0501
DAKO Swine Serum                             X0901
DAKO EnVision+ Mouse                         K4007
DAKO EnVision+ Rabbit                        K4003
DAKO LSAB+                                   K0690
DAKO DAB+                                    K3468
DAKO Hematoxylin                             S3302
Dakomount Mounting Media                     S3025

Instruments                                  Serial Numbers
DAKO Autostainers                            3400-6613-03, 3400-6142R-03
Autostainer IHC Software Version V3.0.2





[0210] Pretreatments were critical in optimizing these antibodies on lung tissue. For antibodies
requiring enzyme digestion, DAKO Proteinase K (code S3020) was used for 5 minutes at room
temperature. Antibodies requiring heat-induced target retrieval received pretreatment using
either DAKO Target Retrieval Solution (code S1700) or DAKO High pH Target Retrieval
Solution (code S3307). Tissues were placed in pre-heated Target Retrieval Solution and
incubated in a 95°C water bath for 20 or 40 minutes depending on the specific protocol. Tissue
sections were then allowed to cool at room temperature for an additional 20 minutes.
[0211] After de-paraffinization, rehydration and tissue pretreatment, all specimens were
incubated in a solution of 3% hydrogen peroxide to quench endogenous peroxidase activity.
Blocking reagents were used specifically for the two antibodies FGF and Telomerase in order to
minimize nonspecific background.

[0212] As shown in Table 8, below, tissue specimens were incubated for a specified length of
time with 200 microliters of the optimally diluted primary antibody. It is noted that the
numbering of the markers/antibodies in Table 8 is consistent with the numbering of the antibody
probes and markers throughout this document. Slides were then washed in DAKO 1X
Autostainer Buffer (code S3306). Depending on the antibody, the correct detection system was
applied. The steps and total incubation times for the DAKO EnVision+ HRP and LSAB+ HRP
detection systems are shown in Table 9, below. The color reaction is developed using 3,3'-
diaminobenzidine (DAB), resulting in a brown color precipitate at the site of the reaction.
Table 9:

Detection Systems Used in the Study

Step  Stage                               Procedure
1     Deparaffinization and rehydration   2 baths of Histoclear for 5 mins each
                                          2 baths of 100% alcohol for 3 mins each
                                          2 baths of 95% alcohol for 3 mins each
                                          Water Rinse
2     Pretreatments                       TRS 40 or 20 mins
                                          High pH TRS 20 mins
                                          Proteinase K for 5 mins
                                          Water Rinse
3     Peroxidase block                    Peroxide bath for 5 mins
                                          Water Rinse
                                          Buffer for 5 mins
                                          Protein Block for 30 mins after H2O2 Block
4     Primary Ab                          30 mins or Overnight at room temp
5     Detection System                    EnV+ Systems               OR   LSAB+ System
                                          Labelled Polymer                Secondary Reagent
                                          30 mins                         15 mins Secondary Ab link
                                                                          Tertiary Reagent
                                                                          15 mins SA-HRP
6     Chromogen                           Chromogen                       Chromogen
                                          10 mins DAB+                    5 mins DAB+

[0213] Following immunostaining all slides were incubated in DAKO Hematoxylin (code
S3302) for 3 minutes and coverslipped using DAKOMount Mounting Media (S3025). All
protocols were run on DAKO Autostainers (serial #'s 3400-6612-03 & 3400-6142R-03) using
the IHC software version 3.0.2.

[0214] Immunostaining was viewed under a light microscope to determine that controls were
correctly stained and tissues were intact. Slides were labeled, boxed and sent to designated
pathologists for results interpretation. Trained pathologists identified the type of cancer or other
lesion seen in the samples. Trained pathologists assessed the sensitivity to the marker probe by
estimating the staining density and proportion of cells stained. These scores were entered in a
data sheet for that patient. The pathologists were blinded to the original diagnosis and antibody
marker used in the immunostaining. Each slide was read by at least two pathologists and results
recorded on a data collection form. To provide additional integrity to the process, the method is
repeated with a second or third pathologist. The scores obtained can then be matched to identify
data entry errors. The additional data also facilitates a better classifier design.
[0215] For each case, up to 27 slides were analyzed, each stained for a marker coded with
numbers 1 through to 17 and 19 through to 28. Staining for marker 18 (C-MET) could not be
optimized and the marker/probe was therefore not used. Pathologist 1 scored slides from all 175
cases. Pathologist 2 scored slides from 99 of the cases. Pathologist 3 scored slides from 80 of
the cases.

[0216] Table 10 below shows how many cases of each diagnosis each pathologist scored slides
from:

Table 10:

Group     Diagnosis                   Pathologist 1   Pathologist 2   Pathologist 3
Cancer    Adenocarcinoma              25              12              14
Cancer    Large Cell Carcinoma        18              9               9
Cancer    Mesothelioma                26              14              8
Cancer    Small Cell Lung Cancer      20              12              6
Cancer    Squamous Cell Carcinoma     24              13              11
Control   Emphysema                   34              23              13
Control   Granulomatous Disease       3               3               2
Control   Interstitial Lung Disease   25              13              17

[0217] For the purposes of some selected statistical analysis techniques, it was necessary to
consider only those cases that had scores for all 27 slides present. Table 11 below shows how
many cases of each diagnosis were complete in terms of having scores from all 27 slides.
Table 11:

Group     Diagnosis                   Pathologist 1   Pathologist 2   Pathologist 3
Cancer    Adenocarcinoma              14              10              8
Cancer    Large Cell Carcinoma        12              9               3
Cancer    Mesothelioma                17              13              3
Cancer    Small Cell Lung Cancer      7               9               1
Cancer    Squamous Cell Carcinoma     12              13              4
Control   Emphysema                   32              21              1
Control   Granulomatous Disease       2               1               0
Control   Interstitial Lung Disease   23              7

[0218] From this table, it can be calculated that each pathologist scored the following total
number of complete cases. Pathologist 1 scored all 27 slides for 119 of the cases. Pathologist 2
scored all 27 slides for 83 of the cases. Pathologist 3 scored all 27 slides for 23 of the cases.
[0219] The total number of cancer data points is 172. This comprises 113 data points from
Pathologist 1 and 60 data points from Pathologist 2. The total number of control data points is
101. This comprises 62 data points from Pathologist 1 and 39 data points from Pathologist 2.
[0220] Figure 3 shows a comparison between H-scores for probes 7 and 15 in control tissue
and in cancerous tissue. The x-axis shows the H-scores while the y-axis shows the percent of
cases with that particular H-score. The difference in H-scores is apparent.
[0221] For each patient the scores were entered electronically into a Pathology Review Form
which consolidates the scores into a database showing the patient identifier together with
diagnosis, proportion of cells stained, and staining density. The proportions and density were
consolidated into a single "H-Score" obtained by grading the intensity as: none = 0, weak = 1,
moderate = 2, intense = 3, and the percentage cells as: 0-5% = 0, 6-25% = 1, 26-50% = 2,
51-75% = 3, >75% = 4, and then multiplying the two grades together. For example, 50% weakly
stained plus 50% moderate stained would score 10 = 2x2 + 2x3. This is the standard scoring
system throughout the analysis, except for the section 3(f), below, titled "Effect of Using other
(non-H-score) objective scoring parameters", which investigates alternative scoring systems.
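
The grading rule can be sketched in a few lines of Python. This is a minimal illustration that assumes only the intensity grades and percentage bands listed above; the function and variable names are hypothetical and are not taken from the study's software.

```python
# Illustrative sketch only: all names here are hypothetical.
INTENSITY_GRADE = {"none": 0, "weak": 1, "moderate": 2, "intense": 3}


def percentage_grade(percent_stained):
    """Map a percentage of stained cells to the 0-4 grade listed in the text."""
    if not 0 <= percent_stained <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_stained <= 5:
        return 0
    if percent_stained <= 25:
        return 1
    if percent_stained <= 50:
        return 2
    if percent_stained <= 75:
        return 3
    return 4


def h_score(components):
    """Sum intensity grade x percentage grade over (percent, intensity) pairs."""
    return sum(
        INTENSITY_GRADE[intensity] * percentage_grade(percent)
        for percent, intensity in components
    )


if __name__ == "__main__":
    # A slide with more than 75% of cells intensely stained reaches the maximum.
    print(h_score([(80, "intense")]))
```

With a single intensity class the score ranges from 0 to 12 (grade 3 times grade 4), which matches the zero-to-12 range used later for the random false probe.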
[0222] Standard classification procedures were used to find the best combination of probes.
Typically these use a search procedure such as the "Branch and Bound Algorithm" to find a
hierarchy of the best features, ranked according to a test of discriminating power, and truncated
according to a test of significance. This process also defines the decision rule or rules for best
classification.

[0223] The performance of a classifier designed with these features can be estimated from the 
data used to design the classifier. The straightforward application of all the design data to the 
classifier gives a very unsound estimate of performance. 

[0224] The analysis of the data collected in the present example provides the optimum selection
of probes, which provided the best separation of classes. Therefore, panels were obtained that
only needed a few probes to perform the analysis. However, the data showed that near-optimum
performance could be obtained with other combinations of probes. Hence, the invention is
flexible in being adaptable to the availability of probes where cost or supply problems may not
allow the very best combination. In some cases, the invention can simply be applied to the
available features to find an alternative combination. In other cases, the algorithm may be used to
select features in a way that allows cost weightings to be included in the selection process to
arrive at a low cost solution.

[0225] The design of the data collection and analysis experiment was chosen to avoid biases
through the well established double blind procedures where data collection and data analysis
were done independently.

[0226] In the first case the pathologists reviewed slides with conventional staining to allow a 
diagnosis to be made. This diagnosis was entered on the Pathology Review form. The 
pathologists were then presented, in random order, with slides stained by the marker probes for 
scoring the percentage of cells stained and the relative intensity of the staining. The slides were 
numbered to exclude information about the probe from the pathologist. To allow data integrity to
be checked two pathologists reviewed all patients. 

[0227] Data were consolidated into a database that was then reviewed by a team of statisticians. 
Probes were numbered to render their method of action as unseen during the analysis of their 
effectiveness. 

[0228] The first stage of the analysis was to check the integrity of the data by comparing entries 
for each patient. Where large differences were found, the data entries were checked and any 
obvious errors were corrected. Unexplained differences were left in the data. 
[0229] The data were then separately analyzed by four statisticians, using different techniques 
in recognition of the fact that different statistical methodologies are suited to different types of 
discriminating information in the data. 

[0230] The first step in the process of selecting the best probe combination is to divide the data
into two sets, one for designing a classifier and one for testing the performance of the classifier.
By selecting the design made with the design (train) set, but showing the best performance
evaluated on the test set, it can be concluded with confidence that the classifier has generalized to
the structure of the data and not adapted to particular cases seen in the training set.
[0231] In order to test for reliability the analysis was typically repeated with many randomly
selected sets of training data and test data. This approach is generally accepted as giving good
estimates of the classifier performance. Where these tests showed inconsistent selections of
probes, such probe selections were discounted as unreliable.
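
The repeated train/test procedure can be illustrated with a short Python sketch. It assumes a 70%:30% split and 100 trials, the figures quoted later for Tables 14 and 16; the toy threshold classifier and all names are hypothetical stand-ins, not the classifiers actually used in the study.

```python
import random
from statistics import mean


def repeated_holdout(cases, fit, score, n_trials=100, train_fraction=0.7, seed=0):
    """Average a test-set score over many random train/test partitions."""
    rng = random.Random(seed)
    n_train = int(round(train_fraction * len(cases)))
    results = []
    for _ in range(n_trials):
        shuffled = cases[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        model = fit(train)                   # design on the training set only
        results.append(score(model, test))   # evaluate on unseen cases
    return mean(results)


def fit_threshold(train):
    """Toy classifier: cut halfway between the class means of one H-score.

    Assumes the training set contains at least one case of each class.
    """
    cancer = [h for h, label in train if label == 1]
    control = [h for h, label in train if label == 0]
    return (mean(cancer) + mean(control)) / 2


def accuracy(threshold, test):
    """Fraction of test cases that fall on the correct side of the threshold."""
    hits = sum((h > threshold) == bool(label) for h, label in test)
    return hits / len(test)
```

A call such as `repeated_holdout(cases, fit_threshold, accuracy)` takes `cases` as a list of `(h_score, label)` pairs with label 1 for cancer and 0 for control. A large spread in the per-trial results would be one sign of the inconsistency described above.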

d. Statistical Analysis and/or Pattern Recognition 
1. Introduction to Data Analysis 

a. Input Data 

i. Raw Data 

[0232] For each patient the scores were entered electronically into a Pathology Review Form 
that consolidates the scores into a database showing the patient identifier together with diagnosis, 
proportion of cells stained, and staining density. 

ii. Computed data 

[0233] The efficiency of the score for each probe used in the analysis is computed from the
intensity/percentage tables. The proportions and density are consolidated into a single "H-Score"
with a simple rule H = proportion stained x (3 if intense + 2 if moderate + 1 if weakly stained).
This is the feature value associated with that probe.

iii. Alternative computed data parameters

[0234] The H-score described above was heuristically derived. A simple analysis to find a better
way of combining percentages and intensity failed to show a significant improvement over
H-score (Section 3(f), titled "Effect of Using other (non-H-score) objective scoring parameters"). A
larger database may allow the extraction of a better rule in future.

iv. User supplied weighting criteria per marker 

[0235] The invention is flexible in being adaptable to the availability of features where cost or
supply problems may not allow the very best combination. For example, the invention can
simply be applied to the available features to find an alternative combination. Alternatively, the
algorithm used to select features allows cost weightings to be included in the selection process to
arrive at a minimum cost solution. Marker performance estimates are shown for combinations
selected from all the markers collected or only those from one supplier. It is also shown how the
C4.5 package can be used to down-weight certain probes, say on the basis of their high cost.
These probe combinations do not perform as well as the optimum combination, but the
performance might be acceptable in circumstances where cost is a significant factor.
v. User supplied weighting criteria per class
[0236] Some of the methods used allow weightings to be applied to the classes. This is
available in C4.5, where the tree design can optimize the cost. Also, the Discriminant Function
method gives a single parameter output which can be thresholded to give a desired false positive
or false negative probability. A plot of these probabilities for different threshold settings is known
as the Receiver Operating Characteristic (ROC) curve.
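
The relationship between the threshold and the resulting error rates can be sketched as follows. This is a hedged illustration in plain Python, not the discriminant function package used in the study; it plots the curve in the conventional form (true positive rate, i.e. one minus the false negative rate, against false positive rate), and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    scores are classifier outputs; labels are 1 for cancer and 0 for control.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Sweep the threshold from the highest score downwards.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under ROC points ordered by increasing false positive rate."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

Choosing a lower threshold moves along the curve towards fewer false negatives and more false positives, which is the trade-off a user-supplied class weighting is meant to control.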

vi. Detection panels - assumptions 

[0237] A low probability of false negatives was assumed to be desirable for the cancer
detection process (to avoid positive patients being missed, at the cost of an increased number of
false positives who would require re-screening). It was also assumed that the cancer
discrimination process would require a lower false positive score (to minimize patients receiving
the wrong treatment).

[0238] It was assumed that detection panels requiring 6 or more probes to achieve an
acceptable performance would not be cost effective. It was also assumed that a detection panel
with a false negative error rate of more than 5% would not be acceptable. Panels falling outside
this box are not accepted. This assumption acknowledges that cytometric panels are likely to
have a worse performance than the histology based panels analyzed here. The ultimate aim will
be a cytometric panel which performs better than 20% error rate, this being approximately the
performance of cervical Pap smear screeners.

vii. Discrimination panels - assumptions 

[0239] It was assumed that panels requiring 6 or more probes are not cost effective and it was
assumed that an error rate of better than 20% is required. Panels falling outside this box were not
accepted.
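
The two acceptance boxes amount to simple predicates. The sketch below encodes them as stated in the assumptions above (fewer than 6 probes; at most 5% false negatives for detection; better than 20% error for discrimination). The function names are hypothetical.

```python
def accept_detection_panel(n_probes, false_negative_percent):
    """Detection box: fewer than 6 probes and no more than 5% false negatives."""
    return n_probes < 6 and false_negative_percent <= 5.0


def accept_discrimination_panel(n_probes, error_percent):
    """Discrimination box: fewer than 6 probes and an error rate better than 20%."""
    return n_probes < 6 and error_percent < 20.0
```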

b. Output data 

Outputs provided by the present analysis included:

• Confusion Matrices, showing how data from the test set was classified as either true
positive, false positive, true negative or false negative. These may be shown as actual
counts or as percentages. Confusion matrices are discussed in section 2(d) titled
"Performance Metrics". An exemplary confusion matrix, obtained from data analyzed by
decision trees, is shown below in Table 12 for simultaneous discrimination of
adenocarcinoma, squamous cell carcinoma, large cell carcinoma, mesothelioma and small
cell carcinoma.

Table 12:

                 Adeno     Squamous   Large Cell   Mesothelioma   Small Cell
Adeno            67.74%    6.45%      19.40%       0.00%          6.45%
Squamous Cell    2.94%     75.47%     11.67%       0.00%          8.82%
Large Cell       28.00%    8.00%      44.00%       8.00%          12.00%
Mesothelioma     0.00%     2.564%     5.128%       89.74%         2.56%
Small Cell       0.00%     3.85%      23.08%       3.85%          69.23%

• Error Rates, summarizing data in the confusion matrix as the sum of all false
classifications divided by the total number of classifications made, expressed as a
percentage.

• Receiver Operating Characteristic (ROC) curves, which show the estimated percentage (or
per unit probability) of false positive and false negative scores for different threshold levels
in the classifier. An indifferent classifier, unable to discriminate better than random choice,
would present a ROC curve with equal true and false readings. The area under this curve
would be 50% (0.5 probability).

• Area Under the Curve (AUC), which is often used as an overall estimate of classifier
performance; most standard discriminant function packages provide this AUC figure.
A perfect classifier would have 100% Area Under the Curve, and a useless classifier
would have an AUC near 50% (0.5 probability).

• Sensitivity and specificity (can be derived from the confusion matrix). See section
3(d)(iii) titled "Sensitivity and Specificity".

• Marker correlation matrices. See Figure 4.
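
For a two-class (cancer versus control) confusion matrix, the error rate, sensitivity and specificity listed above follow directly from the four counts. The sketch below uses the definitions given in the Background section; the function name and the example counts are hypothetical and are not taken from the study data.

```python
def summarize_confusion(tp, fn, fp, tn):
    """Derive summary metrics from the four cells of a 2x2 confusion matrix.

    tp/fn: cancer cases classified as cancer/control.
    fp/tn: control cases classified as cancer/control.
    """
    total = tp + fn + fp + tn
    return {
        "error_rate_percent": 100.0 * (fp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(summarize_confusion(tp=90, fn=10, fp=5, tn=95))
```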

i. Detection panels: composition

[0240] These panels are trained on data divided into two classes, patients with any of the five
cancers and patients with none of the cancers. Not all probes were present for all patients. Where
one or more probes were missing for a particular analysis these cases were excised from the data.
Hence, where analysis was undertaken on reduced numbers of probes the data set might include
slightly more cases.

[0241] The number of probes included in the analysis was 27, although in many cases a false
probe was added, for which the data entered were drawn from a random number generator set
to generate numbers uniformly between zero and 12. This false probe was included in much of
the early analysis to ensure integrity in the probe selection process. This false probe was also
used in one approach to progressively eliminate probes from the analysis. Probes that contributed
less information than the false probe could be readily identified and excluded from the selection
process. Early elimination of such probes speeds the analysis and renders the analysis less
vulnerable to variations in results (noise) caused by these probes.
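
One way to picture the false-probe screen is the sketch below. The text does not state which measure of "information" was compared against the false probe, so this example assumes a single-probe AUC (the chance that a random cancer case outscores a random control) purely for illustration; the function names are hypothetical.

```python
import random


def separation(values, labels):
    """Single-probe AUC, folded so 0.5 means no information and 1.0 perfect separation."""
    cancer = [v for v, y in zip(values, labels) if y == 1]
    control = [v for v, y in zip(values, labels) if y == 0]
    # Count cancer/control pairs where the cancer case scores higher (ties count half).
    wins = sum((c > k) + 0.5 * (c == k) for c in cancer for k in control)
    auc = wins / (len(cancer) * len(control))
    return max(auc, 1.0 - auc)


def probes_beating_false_probe(probe_scores, labels, seed=0):
    """Keep only probes that separate the classes better than a random false probe."""
    rng = random.Random(seed)
    # False probe: random values drawn uniformly between zero and 12, as in the text.
    false_probe = [rng.uniform(0, 12) for _ in labels]
    baseline = separation(false_probe, labels)
    return [name for name, values in probe_scores.items()
            if separation(values, labels) > baseline]
```

Here `probe_scores` maps a probe name to its list of H-scores, in the same case order as `labels` (1 for cancer, 0 for control).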

ii. Detection Panel Performance 

[0242] As outputs from this study, the probe combinations selected by the different
methodologies and their performance estimates in terms of the confusion matrix, % error rate,
and AUC are reported.

iii. Detection Panels - alternative compositions 

[0243] Detection panels were also selected from reduced sets of probes. In one set of panels,
performance measures of panels weighted for commercially preferred markers were obtained.
The performance obtained when the best probe was removed from the analysis to find a new
combination of discriminating probes was also analyzed. The performance of a single probe
acting on its own was found to be very high (probe 7). However, as shown below in the
performance diagrams (Table 13), evaluated using linear discriminant analysis, the performance
was improved as more markers were added. The best subsets of probes were determined using
best subsets logistic regression. The improvement is statistically significant.
Table 13:

Panel                                   Cancer    Control
Probe 7                      Cancer     87.93%    12.07%
                             Control    0.00%     100.00%

Probes 7 and 16              Cancer     93.10%    6.90%
                             Control    1.16%     98.84%

Probes 7, 15 and 16          Cancer     90.52%    9.48%
                             Control    1.16%     98.84%

Probes 1, 7, 15, and 16      Cancer     90.52%    9.48%
                             Control    0.00%     100.00%

Probes 1, 4, 7, 15, and 16   Cancer     92.24%    7.76%
                             Control    1.16%     98.84%

[0244] The best and second best subsets of probes (determined using best subsets logistic
regression) and evaluated using logistic regression are shown below. AUC = Area under ROC
curve. It is noted that mean AUC is the average from 100 trials on random train and test
partitions (70%:30%). The results are shown below, in Table 14.
Table 14:

Probes              Mean AUC
7                   94.28
28                  80.14
7, 16               95
7, 15               94.59
7, 15, 16           95.94
1, 7, 16            95.33
1, 7, 15, 16        95.61
4, 7, 15, 16        95.34
1, 4, 7, 15, 16     95.3
1, 7, 11, 15, 16    95.57

iv. Discrimination panels - composition

[0245] For this part of the study five classifiers were designed and tested, each designed to
detect the presence of one of the cancers from all patients with cancer. The application of this
five way pair-wise system allows doubtful cases to appear more than once in the analysis, or not
at all. Such cases can be identified and subjected to closer scrutiny, re-testing or alternative
testing regimes.
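
The handling of doubtful cases can be sketched as a simple triage over the outputs of the five classifiers. This is an illustrative sketch only; the data layout (a dictionary of yes/no votes per case) and all names are assumptions and do not describe the study's actual software.

```python
CANCER_TYPES = (
    "adenocarcinoma",
    "squamous cell carcinoma",
    "large cell carcinoma",
    "mesothelioma",
    "small cell lung cancer",
)


def triage(case_votes):
    """Separate cleanly assigned cases from doubtful ones.

    case_votes maps a case identifier to {cancer type: True/False}, one vote
    per classifier. A case is doubtful when it is claimed by more than one
    classifier or by none of them.
    """
    assigned, doubtful = {}, []
    for case_id, votes in case_votes.items():
        claimed = [t for t in CANCER_TYPES if votes.get(t, False)]
        if len(claimed) == 1:
            assigned[case_id] = claimed[0]
        else:
            doubtful.append(case_id)
    return assigned, doubtful
```

Cases returned in the `doubtful` list are the ones that would be sent for closer scrutiny, re-testing or an alternative testing regime.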

[0246] Again the number of probes in the study was 27, with a false probe used in the early
stage to reduce the numbers in the analysis.

v. Discriminant panels - performance 

[0247] The performance estimators described above were used to show the performance of the
best probe combinations discovered by the different techniques.

vi. Discriminant panels - alternative composition

[0248] The analysis was repeated for a probe combination comprising commercially preferred
probes. Performance was degraded, but not unusable for several reduced-set classifiers. Below,
the best subsets of probes without probe 7 (determined using best subsets logistic regression) are
shown, as Table 15. The data was evaluated using linear discriminant analysis.
Table 15:

Panel                                     Cancer      Control
Probe 28                       Cancer     0.706897    0.293103
                               Control    0.093023    0.906977

Probes 10 and 28               Cancer     0.793103    0.206897
                               Control    0.034884    0.965116

Probes 10, 15 and 28           Cancer     0.810345    0.189655
                               Control    0.011628    0.988372

Probes 1, 10, 15 and 28        Cancer     0.827586    0.172414
                               Control    0.011628    0.988372

Probes 1, 10, 15, 16 and 28    Cancer     0.827586    0.172414
                               Control    0.011628    0.988372

[0249] The best and second best subsets of probes without probe 7 (determined using best subsets
logistic regression) and evaluated using logistic regression are shown below. AUC = Area under
ROC curve. It is noted that mean AUC is the average from 100 trials on random train and test
partitions (70%:30%). The results are shown below, in Table 16.
Table 16:

Probes               Mean AUC
28                   79.36%
10                   32.28%
10, 28               94.21%
15, 28               88.68%
10, 15, 28           92.90%
1, 10, 28            93.59%
1, 10, 15, 28        92.99%
8, 10, 15, 28        93.20%
1, 10, 15, 16, 28    93.13%
1, 8, 10, 15, 28     93.57%

2. Data analysis methodology

[0250] In this section, the process of gaining an initial understanding of the structure of the
data as a guide to interpreting results from the different methodologies used is described.
a. Analysis of variance

i. Pathologist-to-pathologist variability and pooling pathologist scores.

(1) t-Test

[0251] Two pathologists reviewed each patient's slides in this clinical trial. Pathologist 1
reviewed all patients, Pathologist 2 also reviewed approximately half of this set and Pathologist 3
reviewed the remainder. With two independent estimates of the H-score, the consistency of
pathologist performance could be tested.

[0252] A readily available statistical tool was used to test the variability between pathologists.
This is the paired-sample t-test. This takes the difference between each pair of estimates,
averages these and expresses the average relative to its standard error (the standard deviation of
the differences divided by the square root of the number of pairs). The t-test then converts
this ratio into a probability estimating the likelihood that the two sample sets came from the
same population (the P value).
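
The paired-sample statistic can be computed with the Python standard library as sketched below. The function name and the example scores are hypothetical. Converting the t statistic into a P value requires the t distribution, which a statistics package would supply, so that step is left out of this sketch.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for paired H-scores.

    scores_a and scores_b hold the two pathologists' scores for the same
    slides, in the same order. Raises ZeroDivisionError if every pairwise
    difference is identical, since the standard error is then zero.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


if __name__ == "__main__":
    # Hypothetical scores for four slides read by two pathologists.
    print(paired_t([5, 7, 9, 6], [4, 5, 8, 6]))
```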

[0253] This test was applied to the scores for each marker probe, for all cases reviewed by
Pathologist 1 and Pathologist 2, and also for all cases reviewed by Pathologist 1 and Pathologist
3. Since there were 27 tests applied (to cover all probes) a low value of P = 0.01 was selected as
the "significant threshold". Results, showing the P scores for each probe, and for the two pairs of
pathologists, are shown below, in Tables 17, 18, 19 and 20. It is clear that Pathologist 1 and
Pathologist 2 were more consistent than Pathologist 1 and Pathologist 3.
Table 17:

Pathologist 1, Pathologist 2 scores:

X1             X2             X3             X4             X5             X6             X7
0.5875446      0.01051847     0.4659704      0.4659704      0.3772S94      0.2307273      0.01001357

X8             X9             X10            X11            X12            X13            X14
0.004131056    0.7703014      0.1640003      0.2374452      0.9530652      0.1587876      0.001200265

X15            X16            X17            X18            X19            X20            X21
0.19742        0.3360899      0.3829022      NA             0.544601       0.08873848     0.1686243

X22            X23            X24            X25            X26            X27            X28
0.5428451      0.1912477      0.4031977      0.2477236      0.5673386      0.9174037      0.00339071
Table 18:

Pathologist 1, Pathologist 2 scores thresholded at 0.01 (α = 1% level of significance):

X1      X2      X3      X4      X5      X6      X7
TRUE    TRUE    TRUE    TRUE    TRUE    TRUE    TRUE

X8      X9      X10     X11     X12     X13     X14
FALSE   TRUE    TRUE    TRUE    TRUE    TRUE    FALSE

X15     X16     X17     X18     X19     X20     X21
TRUE    TRUE    TRUE    NA      TRUE    TRUE    TRUE

X22     X23     X24     X25     X26     X27     X28
TRUE    TRUE    TRUE    TRUE    TRUE    TRUE    FALSE
Table 19:

Pathologist 1, Pathologist 3 scores:

X1             X2             X3             X4             X5             X6             X7
3.814506e-09   0.0399131      0.1954867      5.671062e-05   0.01856276     0.27^7* Id^    0.2292583

X8             X9             X10            X11            X12            X13            X14
2.044038e-12   0.004166467    0.00983267     0.003710155    0.01461007     0.03312421     0.0003367823

X15            X16            X17            X18            X19            X20            X21
0.0005162036   0.2276537      0.002987705                   4.267708e-06   0.007287372    0.1654067

X22            X23            X24            X25            X26            X27            X28
0.02400127     0.0009497766   2.478456e-07   0.1591684      0.0831S303     3.122143e-05   1

Table 20:

Pathologist 1, Pathologist 3 scores thresholded at 0.01 (α = 1% level of significance):

X1      X2      X3      X4      X5      X6      X7
FALSE   TRUE    FALSE   FALSE   TRUE    TRUE    TRUE

X8      X9      X10     X11     X12     X13     X14
FALSE   FALSE   FALSE   FALSE   TRUE    TRUE    FALSE

X15     X16     X17     X18     X19     X20     X21
FALSE   TRUE    FALSE   FALSE   FALSE   FALSE   TRUE

X22     X23     X24     X25     X26     X27     X28
TRUE    FALSE   FALSE   TRUE    TRUE    FALSE   TRUE
[0254] Because the H score is subjective it is prone to scale factor differences and "noise" at
marginal cases. So, in spite of the three features which showed statistically different scores
between Pathologist 1 and Pathologist 2, this joint data was accepted as representative of a
measuring instrument. Pathologist 1 and Pathologist 2 were combined into a single data set for
the analysis process. The results for Pathologist 3 were withheld for independent testing
purposes. Such tests using the Pathologist 3 data would be biased towards showing an
under-performance because of the significant differences.

[0255] The data from Pathologist 1 and Pathologist 2 were combined by considering them as
separate cases, with the variability giving a degree of independence between the results for any
one case. When testing with such data the performance estimates will be biased towards a more
optimistic value. This is because samples coming from the same patient may occur
simultaneously in the training and test subsets. This does not however invalidate the processes
used to find the best combination of features; it merely biases the estimate of performance.

(2) Analysis of Variance of H-Scores
(a) Background

[0256] Within each probe, the H-scores may vary for many reasons. To the extent that they vary
consistently with the type of disease, this is useful. Variation due to which pathologist read the
slide is instructive, whereas random variation sets a limit on the detection of the previous two
sources of variation.

[0257] Analysis of Variance (ANOVA) is a standard technique for splitting up the sources of
variation in data and for testing its statistical significance. ANOVA summarizes the total
variation of a set of data as a sum of terms which can be attributed to specific sources, or causes,
of variation.
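
The arithmetic behind one row of an ANOVA table can be checked with a few lines of Python. The sketch below reproduces the Mean Sq and F value of the Disease row for Probe 1 from the sums of squares and degrees of freedom reported in Table 21; the function name is hypothetical, and the study itself used an R script rather than this code.

```python
def anova_row(sum_sq, df, residual_sum_sq, residual_df):
    """Return (mean square, F statistic) for one source of variation.

    The mean square is the sum of squares divided by its degrees of freedom;
    F is the ratio of that mean square to the residual mean square.
    """
    mean_sq = sum_sq / df
    residual_mean_sq = residual_sum_sq / residual_df
    return mean_sq, mean_sq / residual_mean_sq


if __name__ == "__main__":
    # Probe 1, Disease row: Sum Sq 443.56 on 5 Df; Residuals: 1143.93 on 204 Df.
    mean_sq, f_value = anova_row(443.56, 5, 1143.93, 204)
    print(round(mean_sq, 2), round(f_value, 4))
```

The Disease factor has 6 levels (five cancers plus the pooled control group), giving 5 degrees of freedom; Pathologist has 2 levels, giving 1; and their interaction has 5 x 1 = 5.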

[0258] ANOVA is available in many statistical packages. The public domain package "R" was
chosen ("The R Project for Statistical Computing", http://www.R-project.org/).

(b) Aim

[0259] To perform ANOVA analyses on the H-score data from pathologists 1 and 2 and to
consider whether this data can be safely merged into a single consistent set for further analysis
for the selection of panels.

(c) Methodology

[0260] From the database, data was selected from pathologists 1 and 2. Only data which was
complete for a given probe was used in the ANOVA for that probe.

[0261] The control categories of Emphysema, Granulomatous Disease, and Interstitial Lung
Disease were grouped together and called "Normal", giving 6 levels within factor Disease.
[0262] Pathologist was coded as a factor with 2 levels (Pathologist 1, Pathologist 2).
[0263] ' An R script was written to perform a standard ANOVA analysis for each probe in tum, 
using the factors: Disease, Pathologist, and the interaction term Disease:Pathologist. The results 
are shown in below, in Table 2 1 . "Df * is defined as the degrees of freedom. In a dataset of n 
observations, Icnowing n-1 deviations from the mean, the nth is automatically determined. N-1 is 
the number of degrees of freedom. Sum Sq and mean Sq are measiu-es of variation. F is a test 
statistic concerning the equality of two variances based on the F distribution. Pr(>F) is the 
probability used to determine v/hether or not the variability is statistically significant. 
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table 21: 

Analysis of Variance of H-Scores 



Frobel 












Df Sum Sq Mean 


Sq 


F value 


?,r(>F) 


Disease 


5 443.56 88. 


71 


15.8202 


3.690e-13 *** 


Pathologist 


1 0.66 0. 


66 


0.1174 


0.7323 . . 


Disease : Pathologist 


5 15.34 3. 


07 


0.5470 


0.7405 


Rasiduals 


204 1143:93 5: 


61 






Signif. codes: 0 


0.001 '**' 0.01 ' 


* » 


0,05 ^• 


0.1 ^ ' 1 




Probe2 












Df Sum Sq Mean 


Sq 


F value 


Pr {>F) 


Disease 


5 1067.39 213 


.48 


24.1234 


<2e-16 *** 


Pathologist 


1 13.02 13 


.02 


1.4709 


0.2263 


Disease : Pathologist. 


5 27.98 5 


60 


0.6324 


0.6752 


Residuals 


249 2203,50 8 


.85 






Signif. codes: 0 


0.001 0.01 




0.05 ^ 


' 0.1 ^ ' 1 




Probe 3
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1098.49   219.70  21.0751     <2e-16 ***
Pathologist            1     6.73     6.73   0.6458     0.4224
Disease:Pathologist    5    29.72     5.94   0.5703     0.7227
Residuals            243  2533.16    10.42
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 4
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5    631.8    126.4   9.3707  3.454e-08 ***
Pathologist            1      6.6      6.6   0.4869     0.4860
Disease:Pathologist    5     13.1      2.6   0.1939     0.9647
Residuals            246   3317.1     13.5
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 5
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   754.30   150.86  25.2826     <2e-16 ***
Pathologist            1    14.25    14.25   2.3875     0.1236
Disease:Pathologist    5     7.54     1.51   0.2528     0.9381
Residuals            248  1479.80     5.97
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Probe 6
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   721.91   144.38  11.8515  2.771e-10 ***
Pathologist            1     1.91     1.91   0.1568     0.6925
Disease:Pathologist    5    47.82     9.56   0.7850     0.5613
Residuals            246  2996.93    12.18
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1




Probe 7
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1171.47   234.29  77.6802     <2e-16 ***
Pathologist            1     8.84     8.84   2.9294    0.08847 .
Disease:Pathologist    5    46.36     9.27   3.0742    0.01063 *
Residuals            209   630.37     3.02
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1




Probe 8
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   209.82    41.96   6.4352  1.201e-05 ***
Pathologist            1    12.66    12.66   1.9407    0.16483
Disease:Pathologist    5    71.20    14.24   2.1838    0.05654 .
Residuals            251  1636.76     6.52
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 9
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   197.21    39.44   8.4348  2.015e-07 ***
Pathologist            1     7.33     7.33   1.5681     0.2116
Disease:Pathologist    5    24.56     4.91   1.0505     0.3884
Residuals            265  1239.17     4.68
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1




Probe 10
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1113.46   222.69  39.0730     <2e-16 ***
Pathologist            1     1.01     1.01   0.1778    0.67371
Disease:Pathologist    5    62.45    12.49   2.1916    0.05635 .
Residuals            213  1213.96     5.70
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Probe 11
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   320.15    64.03   9.5553  2.416e-08 ***
Pathologist            1     1.28     1.28   0.1918     0.6613
Disease:Pathologist    5    10.04     2.01   0.2996     0.9128
Residuals            245  1641.76     6.70
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 12
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   832.26   166.45  27.8793     <2e-16 ***
Pathologist            1     0.18     0.18   0.0307     0.8610
Disease:Pathologist    5    15.16     3.03   0.5079     0.7701
Residuals            248  1480.68     5.97
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 13
                      Df   Sum Sq  Mean Sq  F value        Pr(>F)
Disease                5   46.594    9.319   7.8408  [?].674e-07 ***
Pathologist            1    0.044    0.044   0.0368        0.8481
Disease:Pathologist    5   10.143    2.029   1.7069        0.1343
Residuals            210  249.584    1.188
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 14
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1305.69   261.14  23.9450     <2e-16 ***
Pathologist            1    28.66    28.66   2.6279    0.10630
Disease:Pathologist    5   142.90    28.58   2.6208    0.02492 *
Residuals            243  2649.98    10.91
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 15
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   401.02    80.20   21.268     <2e-16 ***
Pathologist            1    13.17    13.17    3.493     0.0630 .
Disease:Pathologist    5     6.17     1.23    0.327     0.8963
Residuals            214   807.02     3.77
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Probe 16
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  2520.26   504.05  65.5572     <2e-16 ***
Pathologist            1     0.15     0.15   0.0194     0.8892
Disease:Pathologist    5    24.29     4.86   0.6318     0.6757
Residuals            247  1899.12     7.69
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 17
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   530.64   106.13  13.0178  2.426e-11 ***
Pathologist            1     8.42     8.42   1.0325    0.31050
Disease:Pathologist    5   109.96    21.99   2.6975    0.02131 *
Residuals            266  2168.55     8.15
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 19
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1670.86   334.17  29.1960     <2e-16 ***
Pathologist            1     2.17     2.17   0.1895     0.6637
Disease:Pathologist    5    32.61     6.52   0.5698     0.7231
Residuals            248  2838.56    11.45
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 20
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   964.71   192.94  34.2760     <2e-16 ***
Pathologist            1     8.83     8.83   1.5687     0.2116
Disease:Pathologist    5    19.60     3.92   0.6963     0.6267
Residuals            245  1379.12     5.63
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 21
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5    6.927    1.385   2.0604    0.07076 .
Pathologist            1    0.464    0.464   0.6906    0.40670
Disease:Pathologist    5    1.576    0.315   0.4687    0.79945
Residuals            263  176.830    0.672
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Probe 22
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   640.16   128.03  31.7250     <2e-16 ***
Pathologist            1     1.64     1.64   0.4058     0.5247
Disease:Pathologist    5    18.78     3.76   0.9305     0.4617
Residuals            247   996.81     4.04
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 23
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1915.62   383.12  46.5565     <2e-16 ***
Pathologist            1    10.77    10.77   1.3092     0.2537
Disease:Pathologist    5    20.92     4.18   0.5084     0.7698
Residuals            246  2024.39     8.23
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 24
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   516.06   103.21  24.0786     <2e-16 ***
Pathologist            1     9.52     9.52   2.2210     0.1376
Disease:Pathologist    5    12.48     2.50   0.5823     0.7135
Residuals            216   925.87     4.29
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 25
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1761.26   352.25  34.5245     <2e-16 ***
Pathologist            1    11.51    11.51   1.1285     0.2891
Disease:Pathologist    5    41.49     8.30   0.8134     0.5411
Residuals            248  2530.33    10.20
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



Probe 26
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   399.85    79.97  13.6548  1.428e-11 ***
Pathologist            1     0.30     0.30   0.0517     0.8204
Disease:Pathologist    5    14.81     2.96   0.5056     0.7719
Residuals            214  1253.31     5.86
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Probe 27
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5   117.92    23.58   6.2551  1.956e-05 ***
Pathologist            1     0.64     0.64   0.1695     0.6810
Disease:Pathologist    5    25.52     5.10   1.3539     0.2431
Residuals            212   799.31     3.77
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1




Probe 28
                      Df   Sum Sq  Mean Sq  F value     Pr(>F)
Disease                5  1634.60   326.92   38.171     <2e-16 ***
Pathologist            1     8.40     8.40    0.981     0.3229
Disease:Pathologist    5    16.15     3.23    0.377     0.8643
Residuals            267  2286.76     8.56
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
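
The relationship between the columns of each ANOVA table can be checked with a short
calculation. The following Python sketch is offered for illustration only (the analysis itself
was run in R; the function name anova_rows is hypothetical). It recomputes the Mean Sq and
F value columns for Probe 1 from the Df and Sum Sq columns, assuming the usual definitions
Mean Sq = Sum Sq / Df and F = effect Mean Sq / residual Mean Sq.

```python
# Recompute Mean Sq and F value for one ANOVA table from its Df and Sum Sq columns.
# The numbers are the Probe 1 rows of Table 21; anova_rows is an illustrative name.

def anova_rows(rows, residual):
    """rows: {term: (df, sum_sq)}; residual: (df, sum_sq).
    Returns {term: (mean_sq, f_value)}."""
    res_df, res_ss = residual
    res_ms = res_ss / res_df              # residual mean square
    out = {}
    for term, (df, ss) in rows.items():
        ms = ss / df                      # mean square = sum of squares / df
        out[term] = (ms, ms / res_ms)     # F = effect mean square / residual mean square
    return out

probe1 = {
    "Disease": (5, 443.56),
    "Pathologist": (1, 0.66),
    "Disease:Pathologist": (5, 15.34),
}
table = anova_rows(probe1, residual=(204, 1143.93))
for term, (ms, f) in table.items():
    print(f"{term:22s} Mean Sq = {ms:7.2f}   F = {f:8.4f}")
```

Small differences from the tabulated F values are expected, because the printed Sum Sq
entries are themselves rounded.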



(d) Analysis of Results

[0264] In all cases (except for probe 21) the response of the probes was related to disease. This
is not surprising since the probes have presumably been selected for this purpose. In no case is
the response of the probe related to pathologist (at the p=0.05 level). This indicates that it would
be safe to merge this data and use the two pathologists as two measurements on the data.
[0265] In a few cases (probes 7, 14, and 17) there is some evidence of an interaction term gaining
significance. This indicates that there may be some difference between pathologists in their
scoring of some diseases. Some of these cases may well be due to an occasional outlier in the
data.
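
The screening described in the two paragraphs above amounts to comparing each Pr(>F)
value with a chosen significance level. As a minimal sketch (in Python, for illustration; the
variable names are hypothetical), the Disease:Pathologist p-values transcribed from Table 21
can be filtered at the 0.05 level as follows:

```python
# Flag probes whose Disease:Pathologist interaction term is significant.
# p-values are transcribed from Table 21 (probe 18 is the dummy marker and has no entry).
ALPHA = 0.05

interaction_p = {
    1: 0.7405, 2: 0.6752, 3: 0.7227, 4: 0.9647, 5: 0.9381, 6: 0.5613,
    7: 0.01063, 8: 0.05654, 9: 0.3884, 10: 0.05635, 11: 0.9128, 12: 0.7701,
    13: 0.1343, 14: 0.02492, 15: 0.8963, 16: 0.6757, 17: 0.02131, 19: 0.7231,
    20: 0.6267, 21: 0.79945, 22: 0.4617, 23: 0.7698, 24: 0.7135, 25: 0.5411,
    26: 0.7719, 27: 0.2431, 28: 0.8643,
}

# Probes significant at the 0.05 level
flagged = sorted(probe for probe, p in interaction_p.items() if p < ALPHA)
# Probes that miss the 0.05 level but fall below 0.1 (marked '.' in R output)
borderline = sorted(probe for probe, p in interaction_p.items() if ALPHA <= p < 0.1)

print("significant interaction:", flagged)
print("borderline interaction: ", borderline)
```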

(e) Conclusions

[0266] The results indicate that it is safe to merge this data for further analysis. The data
indicate that the slight interactions in some cases between pathologist and disease appear to be
attributable to random sources.

ii. Patient to patient variability 
[0267] The variability from patient to patient was measured by the disease:disease variability of 
section 2(a)(i)(2) (see above, "Analysis of Variance of H-Scores"). 






iii. Marker-to-marker variability 
[0268] Histograms were plotted (PathologistData.xls, worksheet: Histograms) showing the
distribution of marker scores for each probe for Control vs. Cancer.

b. Marker correlation matrix analyses
[0269] The population correlation coefficient ("Applied Multivariate Statistical Analysis", R.
A. Johnson and D. W. Wichern, 2nd Ed., 1988, Prentice-Hall, N.J.) measures the amount of linear
association between a pair of random variables. Typically the distributions and associated
parameters of the random variables are not known and the population correlation coefficient
cannot be directly computed. In this case it is possible to compute the sample correlation
coefficient from sample data. See Figure 4. The sample correlation coefficient is, however, only
an estimate of the population correlation coefficient. Moreover, because it is calculated on the
basis of sample data it is possible, purely by chance, that it may indicate a strong positive or
negative correlation when in reality there may be no actual relationship between the
corresponding random variables ("Modern Elementary Statistics", J. E. Freund, 6th Ed., 1984,
Prentice-Hall, N.J.).

[0270] The correlation coefficient measures the ability of one variable to predict the other. A 
strong linear association does not, however, imply a causal relationship. The square of the 
correlation coefficient is called the coefficient of determination. The coefficient of determination 
computed for a bivariate data set measures the proportion of the variability in one variable that 
can be accounted for by its linear relationship to the other. When dealing with several variables, 
the correlation coefficient can be calculated for each pair in turn and the set of coefficients can be
written as a matrix called the correlation matrix. See Figure 4. 
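
As an illustration of the quantities just described, the following Python sketch computes the
sample correlation coefficient, the coefficient of determination, and a correlation matrix. The
marker names and H-score values are invented for the example and are not taken from the study
data.

```python
import math

def pearson_r(x, y):
    """Sample (Pearson) correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """columns: {name: values}. Returns a nested dict of pairwise correlations."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical H-scores for three markers on five cases
scores = {
    "M1": [0, 2, 4, 6, 8],
    "M2": [1, 3, 5, 7, 9],   # exactly linear in M1
    "M3": [8, 6, 5, 2, 0],   # decreases as M1 increases
}
matrix = correlation_matrix(scores)
r13 = matrix["M1"]["M3"]
print("r(M1, M2)   =", round(matrix["M1"]["M2"], 4))
print("r(M1, M3)   =", round(r13, 4))
print("r^2(M1, M3) =", round(r13 ** 2, 4))   # coefficient of determination
```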

[0271] The H-scores for the individual markers can be modeled as random variables. The
sample correlation matrix for this multivariate data set can be computed from the input data
described in the section titled "Input Data", above.

c. Pattern recognition

[0272] Statistical pattern recognition is an approach to classifying signals or geometric objects
on the basis of quantitative measurements (called features). Statistical pattern recognition
essentially reduces to the problem of dividing the n-dimensional feature space into regions that
correspond to the categories or classes of interest.

[0273] Three different classifier methodologies employed in this study are sensitive to different
structural forms within the data.

[0274] For the Decision Tree method, a preliminary analysis of different data combinations
identified markers which were never used by C4.5 for the detection panel. These were removed
from the analysis, which gave more consistent results; this suggests that the left-out probes
contributed only noise to the selection process.

[0275] Similarly a preliminary analysis of probes used in the detection panels identified the 
noisy probes for removal prior to the detailed analysis. 

[0276] The Linear Discriminant Function method in SPSS has built-in stepwise processes for 
reducing the numbers of markers in the analysis. Typically, this reduced the probes used in the 
analysis to between 2 and 7. 

[0277] The Logistic Regression methods in R and SAS implement stepwise procedures for
variable selection. In SAS, a best subsets variable selection option is also provided. In R, the
stepwise methodology was used in conjunction with multiple random trials to develop a heuristic
method for selecting variables based on the number of times a given feature was used in 100
random selections of training and test data (split 70%:30% respectively). Features with counts
comparable to the count for an artificial random feature were progressively eliminated until a
minimal consistent set of features was obtained over 100 runs.

i. Statistical methods 
[0278] From the point of view of multivariate statistical analysis, the problem is one of
estimating density functions in high-dimensional space (and partitioning this space into the
regions of interest). Assuming that the distributions of random (feature) vectors are known, the
theoretically best classifier is the Bayes classifier because it minimizes the probability of
classification error (K. Fukunaga, "Statistical Pattern Recognition", 2nd Ed., Academic Press
1990, p.3). Unfortunately the implementation of the Bayes classifier is difficult because of its
complexity, especially when the dimensionality of the feature space is high. In practice, simpler
parametric classifiers are used. Parametric classifiers are based on assumptions about the
underlying density or discriminant functions. The most common such classifiers are linear and
quadratic classifiers. In multivariate statistical analysis such classifiers fall under the heading of
discriminant analysis. Discriminant analysis techniques are closely related to multivariate linear
regression models and generalized linear models (encompassing logistic and multinomial
regression).

(1) Logistic Regression with a Binomial Response 

(a) Background

[0279] The problem of selecting a set of markers to be used on a detection panel can be
formulated as a logistic regression problem with a binomial response. The response variable is a
factor with two levels: normal (no cancer) and abnormal (cancer). The explanatory variables are
the marker H-scores.

[0280] The problem of selecting a set of markers to be used on a cancer discrimination panel 
can also be formulated as a logistic regression problem with a binomial response. The response 
variable is a factor with two levels: normal (not the cancer of interest) and abnormal (cancer of 
interest). The explanatory variables are the marker H-scores. 

[0281] Stepwise variable selection can be used to select a subset of the original variables
(markers) for use in discriminating between the two classes. This is a computationally expensive
exercise and is best suited to a computer. Several commercial and public domain software
packages (e.g., R, S-plus, and SAS) implement stepwise logistic regression.
[0282] Two different approaches to feature selection were investigated based on the stepwise
variable selection procedures found in R and SAS respectively.

(b) Experimental data 

[0283] The data used for the present analysis consists of the H-scores for markers 1-17 and
19-28 for the cases examined by Pathologist 1 and Pathologist 2 and described elsewhere in this
report. In addition, a dummy marker, 18, was added to the data set. The dummy marker consists
of integer values from 0 to 12 selected at random from a uniform distribution.

(c) Method 1: Using the R package (version 1.4.1)
[0284] Computerized model fitting procedures generally cannot deal with missing data. This is
the case for the glm (glm stands for generalized linear model) procedure used in R. Consequently,
when fitting a model using glm it was necessary to exclude all the cases for which there are one
or more missing values. When fitting the initial full model, containing the 27 real markers and
the single dummy marker, this reduces the data set to only 202 cases. With so few observations it
was decided that the best way to perform variable selection, to train a classifier using the selected
variables, and to assess its performance was to undertake 100 trials on random partitions of the
data into train and test sets.

(i) Partitioning the data into train and test sets 
[0285] At the start of each trial, the data is partitioned into a test set and a training set. This is
done by randomly choosing 30% of the abnormals and 30% of the normals to form the test set,
and using the remaining observations to form the training set.
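
A minimal Python sketch of such a class-wise (stratified) 70%:30% partition is given below
for illustration. The function name, the case identifiers, and the assumed mix of 120 abnormal
and 82 normal cases are hypothetical; only the total of 202 cases and the 30% test fraction come
from the text.

```python
import random

def stratified_split(cases, label_of, test_fraction=0.3, rng=None):
    """Randomly hold out test_fraction of each class; return (train, test)."""
    rng = rng or random.Random()
    by_class = {}
    for case in cases:
        by_class.setdefault(label_of(case), []).append(case)
    train, test = [], []
    for members in by_class.values():
        members = members[:]                      # do not disturb the caller's list
        rng.shuffle(members)
        n_test = round(test_fraction * len(members))
        test.extend(members[:n_test])             # 30% of this class -> test set
        train.extend(members[n_test:])            # remainder -> training set
    return train, test

# Hypothetical data set: 202 complete cases, class mix assumed for the example
cases = [("case%d" % i, "abnormal" if i < 120 else "normal") for i in range(202)]
train, test = stratified_split(cases, label_of=lambda c: c[1], rng=random.Random(0))
print(len(train), "training cases,", len(test), "test cases")
```

Repeating the call with a fresh random state for each of the 100 trials gives the random
partitions used in the procedure.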

(ii) Variable (marker) selection 

[0286] At the start of each trial, the full model, which includes all of the variables (markers), is
fitted to the training data. In R the logistic regression model is fitted using glm. The code
fragment used is as follows:

    my.model <- Class ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 +
        X9 + X10 + X11 + X12 + X13 + X14 + X15 + X16 + X17 + X18 + X19 +
        X20 + X21 + X22 + X23 + X24 + X25 + X26 + X27 + X28

    my.glm <- glm(my.model, family=binomial(link=logit),
        data=training.data)

[0287] The procedure stepAIC is then used to perform stepwise variable selection based on the
Akaike Information Criterion (AIC). This procedure is part of the publicly available MASS
library. The library and the procedure are described in "Modern Applied Statistics with S-PLUS"
(W. N. Venables and B. D. Ripley, Springer-Verlag, New York, 1999). The R code
fragment to do this is as follows:

    my.step <- stepAIC(my.glm, direction="both")

[0288] The resulting model is then assessed on the test data. The code fragment used is as
follows:

    probability_is_abnormal <-
        predict(my.step, testing.data, type="response")

[0289] The performance of the classifier is recorded in terms of the actual error rate of
misclassification (AER) and the area under the ROC curve (AUC). After the 100 trials, 100
models and their associated AERs and AUCs remain. A frequency table is constructed, recording
the number of times each variable made an appearance in the 100 models. An example is shown
in Table 22:

Table 22:

    Variable     1    2    3    4    5    6*   7*   8*   9   11   12*  14
    Frequency    2    6    4    1    4    -    -    -    3   10    -    1

    Variable    15   16*  17   18   19   20   23*  24   25*  28
    Frequency    4    -    2    1    3    2    -   10    -    4

    (* shaded in the original; the frequency values in the shaded cells are not legible.)

[0290] This table is used to decide which markers to discard. First, all of the markers that have
a frequency less than or equal to 10 are discarded. Next a cut-off frequency is chosen based on
the frequency of the dummy marker (typically this is 1 or 1.5 times that of the dummy marker).
All markers with a frequency less than this cut-off value are discarded. The remaining markers,
along with the dummy marker, are then used as the full model for another 100 trials and the
pruning process is repeated. If necessary, the severity of the pruning can be increased to force
one or more markers out of the model. If necessary, the remaining markers can be used as the full
model for yet another 100 trials. Pruning stops when the desired number of panel members is
reached or the average AUC for the current model is less than that for the preceding model.
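
The counting and pruning rule can be sketched as follows. This Python fragment is an
illustration under stated assumptions, not the procedure actually run: the function name, the
toy set of 100 selected models, and the choice of a 1.5x dummy-marker cut-off are all
hypothetical.

```python
from collections import Counter

def prune_markers(models, dummy, min_count=10, factor=1.0):
    """models: list of variable lists, one per trial.
    Keeps markers seen more than min_count times and at least
    factor * (dummy marker frequency) times. The dummy itself is
    excluded from the result (it is re-added to the next full model)."""
    counts = Counter(v for model in models for v in set(model))
    cutoff = factor * counts[dummy]
    kept = sorted(v for v, c in counts.items()
                  if v != dummy and c > min_count and c >= cutoff)
    return counts, kept

# Toy outcome of 100 stepwise trials (X18 plays the role of the dummy marker)
models = ([["X7", "X25"]] * 60 + [["X7", "X18"]] * 12 +
          [["X7", "X3", "X25"]] * 8 + [["X7", "X9"]] * 20)

counts, kept = prune_markers(models, dummy="X18", factor=1.5)
print(dict(counts))
print("retained:", kept)
```

Raising factor makes the pruning more severe; with factor=2.0 the toy marker X9 (frequency
20 against a cut-off of 24) would also be dropped.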
[0291] To illustrate the pruning process consider the table above. The table was obtained using
the detection panel data. The shaded entries indicate those markers that are retained after pruning.
Another 100 trials is performed using the following full model:

    my.model <- Class ~ X6 + X7 + X8 + X12 + X16 + X18 + X23 + X25

[0292] Again, a frequency table, Table 23, is constructed:

Table 23:

    Variable     6*   7*   8*  12*  16   18   23*  25*
    Frequency    -    -    -    -   30   47    -    -

    (* shaded in the original; the frequency values in the shaded cells are not legible.)

[0293] The shaded entries show the markers retained after pruning (using a cutoff of 47).
Another 100 trials is performed using the following full model:

    my.model <- Class ~ X6 + X7 + X8 + X12 + X18 + X23 + X25



[0294] Again, a frequency table, Table 24, is constructed:

Table 24:

    Variable     6    7    8   12   18   23   25
    Frequency    (values not legible in this copy)


[0295] At this point a cut-off of 50 is chosen. The shaded entries show the remaining markers
for use on a 5 member panel. In each step, the average AUC increases: 94.37% -> 95.45% ->
95.78%.

(iii) Assessing the performance of the panel
[0296] To assess the performance of the panel, 100 trials were performed, as before, but
without the stepwise selection procedure. For each trial, the AUC, sensitivity, and specificity are
recorded. For the detection panel example above, the results are:

    > my.model <- Class ~ X7 + X25 + X6 + X23 + X12

    > summary(AUC)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.9289  0.9590  0.9615  0.9601  0.9630  0.9630

    > summary(sensitivity)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.8519  0.9630  0.9630  0.9737  1.0000  1.0000

    > summary(specificity)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.8378  0.9730  0.9730  0.9749  1.0000  1.0000

[0297] In summary, the panel has a sensitivity of 97.37% and a specificity of 97.49%. The area
under the ROC is 96.01%.
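
For a single trial, the three recorded quantities can be computed from the test-set labels and the
predicted probabilities. The Python sketch below is illustrative only: the 0.5 decision threshold
is an assumption (the text does not state the threshold used), the function names are
hypothetical, and the eight example cases are invented. The AUC is computed as the fraction
of (abnormal, normal) pairs that the classifier ranks correctly.

```python
def confusion_rates(y_true, prob, threshold=0.5):
    """Return (sensitivity, specificity, actual error rate) at a given threshold.
    y_true uses 1 for abnormal and 0 for normal."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, prob):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp), (fp + fn) / len(y_true)

def auc(y_true, prob):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pos = [p for y, p in zip(y_true, prob) if y == 1]
    neg = [p for y, p in zip(y_true, prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Invented test set: four abnormal cases followed by four normal cases
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

sensitivity, specificity, aer = confusion_rates(y_true, prob)
area = auc(y_true, prob)
print(sensitivity, specificity, aer, area)
```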

(d) Method 2: Using SAS (version 8.2)
[0298] Logistic regression can be performed in SAS using the procedure LOGISTIC. When the
response variable is a two-level factor, the procedure fits a binary logit model (equivalent to glm
in R with family=binomial and link=logit). SAS automatically excludes all of the missing
multivariate observations for the model specified. Unlike R, SAS is able to perform a best
subsets variable selection procedure. The code fragment in SAS needed to do this is as follows:

    PROC LOGISTIC DATA=WORK.panel;
        CLASS Class;
        MODEL Class = X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22
            X23 X24 X25 X26 X27 X28 / SELECTION=SCORE BEST=28;
    RUN;

[0299] This procedure is applied to the entire data set. The parameter BEST=28 directs SAS to
find the best 28 single-variable models, the best 28 two-variable models, the best 28
three-variable models, and so on up to the 28-variable models.

(i) Assessing the performance of the panels 
[0300] The procedure described in method 1 is used to assess the performance of each of the
panels. The following table, Table 25, was generated from the detection panel data. It lists results
only for the two best one-, two-, three-, four-, and five-marker panels. The Sensitivity and
Specificity columns of the table are not legible in this copy and are omitted.

Table 25:

    Panel   Panel members       Area under ROC
      1     7                   94.28%
      2     28                  80.14%
      3     7, 16               [illegible]
      4     7, 15               94.59%
      5     7, 15, 16           95.94%
      6     1, 7, 16            95.33%
      7     1, 7, 15, 16        95.61%
      8     4, 7, 15, 16        95.34%
      9     1, 4, 7, 15, 16     95.30%
     10     1, 7, 11, 15, 16    95.57%


(2) Linear Discriminant Analysis 
(a) Background 



[0301] The commercial statistical package SPSS has procedures allowing simple linear
discriminant functions to be designed and tested.

[0302] A commonly used method is Fisher's linear discriminant function. This finds the
hyper-plane in feature space which gives a good separation of classes. For a two class problem
where the class distributions have different means, but similar multivariate Gaussian
distributions, this classifier gives optimum performance. The method can be extended
heuristically to multi-class problems, but this was not applied in the study.
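
For two features, Fisher's discriminant direction can be written out by hand. The Python
sketch below is a simplified illustration, not the SPSS procedure: it computes
w = Sw^-1 (m1 - m0) for invented two-marker data, and the midpoint decision threshold is an
assumption corresponding to equal class priors. All names are hypothetical.

```python
def mean_vec(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    """2x2 scatter matrix entries (sxx, sxy, syy) about the mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    return sxx, sxy, syy

def fisher_direction(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0) and a midpoint threshold."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    a0, b0, d0 = scatter(class0, m0)
    a1, b1, d1 = scatter(class1, m1)
    a, b, d = a0 + a1, b0 + b1, d0 + d1          # pooled within-class scatter
    det = a * d - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    w = ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)
    mid = ((m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2)
    return w, w[0] * mid[0] + w[1] * mid[1]

def project(w, p):
    return w[0] * p[0] + w[1] * p[1]

# Invented H-scores for two markers
normal = [(1, 2), (2, 1), (2, 3), (3, 2)]
abnormal = [(6, 5), (7, 7), (8, 6), (7, 6)]

w, threshold = fisher_direction(normal, abnormal)
print("w =", w, "threshold =", threshold)
print([project(w, p) > threshold for p in normal + abnormal])
```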
[0303] The method is simple in its approach but robust to the problems associated with data sets
containing a large number of features relative to the number of cases (in our case there are 27
probes, which poses a problem for a data set comprising only some two hundred exemplars (cases)).

[0304] This package has a procedure for identifying the features which contribute well to the
discrimination process. This "stepwise method" first finds the most discriminating feature. Other
features are then sequentially added and evaluated against the classifier. Combinations are
explored so the final solution may exclude features initially selected if better combinations are
found. The number of features is gradually increased until a statistical test shows the remaining
features do not contribute reliably to the classification process.

[0305] An estimate of the performance is gained by using the leave-one-out method. This
removes one sample from the data set to form the training set. The left-out sample is retained as
the test set, applied to the classifier, and the resulting classification accumulated in the confusion
matrix. The procedure is repeated for each case in the data. This procedure gives an unbiased
estimate of performance, but the estimate will have a high variance.
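
The leave-one-out loop itself is independent of the classifier. In the Python sketch below a
one-dimensional nearest-class-mean rule stands in for the discriminant function purely for
illustration; the function names and the six example cases are hypothetical.

```python
def nearest_mean_predict(train, x):
    """Stand-in classifier: assign x to the class whose training mean is closest."""
    sums, counts = {}, {}
    for value, label in train:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return min(sums, key=lambda lab: abs(x - sums[lab] / counts[lab]))

def leave_one_out(data, fit_predict):
    """Hold out each case once; return the confusion matrix {(true, predicted): count}."""
    confusion = {}
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]          # all cases except the i-th
        pred = fit_predict(train, x)             # classify the held-out case
        confusion[(y, pred)] = confusion.get((y, pred), 0) + 1
    return confusion

# Invented single-marker scores: N = normal, A = abnormal
data = [(1, "N"), (2, "N"), (3, "N"), (8, "A"), (9, "A"), (4, "A")]

confusion = leave_one_out(data, nearest_mean_predict)
errors = sum(n for (true, pred), n in confusion.items() if true != pred)
print(confusion, "error rate =", errors / len(data))
```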

[0306] In SPSS select the appropriate data set for analysis, select "Analyze", select "Classify",
select "Discriminant...", on the table select "Fishers method", "leave one out testing" and "use
stepwise method". Enter the diagnosis as the grouping variable and enter all the features as the
independents. Enter "OK" to complete the analysis. Pre-set values for other parameters were left
as set.

[0307] The analysis output includes a list of the features used in the analysis, the canonical
discriminant function, a confusion matrix, and the correct-classification rate (1 - error rate).
[0308] In order to compute an ROC curve the canonical discriminant function is applied to the
selected features to generate a new feature. In SPSS use Graphs, ROC to plot this curve.

ii. Hierarchical methods: Decision trees
(1) Background

[0309] Decision tree learning is one of the most widely used and practical methods for
inductive inference. It is a method for classification that is robust to noisy data and capable of
learning disjunctive expressions (Tom M. Mitchell, "Machine Learning", McGraw-Hill, New
York, NY, 1997).

[0310] The most popular and accessible machine learning package is "C4.5", the source code of
which is published in J. Ross Quinlan, "C4.5: Programs for Machine Learning", Morgan
Kaufmann, San Mateo CA, 1993.

[0311] When a decision tree is being trained (on training data), the algorithm decides at each
node of the tree which single attribute of the data to use at this node to best make a decision.
Therefore when the tree is completely constructed, it will have selected some set of attributes to
use and ignored others. In our application, using decision trees to process measurements gained
from molecular probes, the decision tree has effectively chosen a panel of probes, and a method
of combining the probe scores, which best explains the classification of the data. To obtain an
unbiased estimate of the panel performance, the resulting tree must be evaluated on data which
was not used in the training. One standard technique for doing this is cross-validation. A 10-fold
cross-validation was employed.
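
The per-node choice of a single attribute can be illustrated with an entropy-based score. The
Python sketch below selects the attribute and threshold with the largest information gain; it
is a simplification for illustration (C4.5 itself uses a gain-ratio criterion with further
refinements), and the probe names and scores are invented.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Best binary split of one continuous attribute: returns (gain, threshold)."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (0.0, None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                              # no threshold between equal values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if base - split > best[0]:
            best = (base - split, thr)
    return best

def best_attribute(table, labels):
    """table: {attribute: values}. Returns (attribute, gain, threshold)."""
    scored = ((name,) + best_threshold(col, labels) for name, col in table.items())
    return max(scored, key=lambda t: t[1])

# Invented scores for two probes on six cases
labels = ["Normal"] * 3 + ["Abnormal"] * 3
table = {"P7": [0, 1, 2, 9, 10, 12], "P21": [3, 4, 3, 4, 3, 4]}

name, gain, thr = best_attribute(table, labels)
print(name, round(gain, 3), thr)
```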

[0312] Cross-validation is a technique for making the very best use of limited data. In 10-fold
cross-validation the data is randomly split into 10 nearly-equal sized partitions, taking care to
have approximately the same number of cases in a class across each partition. Then, the decision
tree is trained on partitions 2-10 combined and tested on partition 1, then trained on partitions 1
and 3-10 combined and tested on partition 2, and so on for 10 trials, rotating the held-out test set
through the data once. In this manner tests are only ever performed on held-out data and so are
unbiased, and all data is tested exactly once so an aggregate error rate across the whole data set
can be computed.
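
A minimal Python sketch of such a class-balanced (stratified) 10-fold rotation is shown below
for illustration. It is not the script supplied with C4.5; the function name and the assumed mix
of 43 normal and 60 abnormal cases are hypothetical.

```python
import random

def stratified_folds(labels, k=10, rng=None):
    """Deal case indices into k folds, class by class, so that each fold holds
    approximately the same number of cases from every class."""
    rng = rng or random.Random()
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    folds = [[] for _ in range(k)]
    pos = 0
    for members in by_class.values():
        rng.shuffle(members)
        for idx in members:
            folds[pos % k].append(idx)            # round-robin keeps sizes within 1
            pos += 1
    return folds

labels = ["Normal"] * 43 + ["Abnormal"] * 60      # assumed class mix
folds = stratified_folds(labels, k=10, rng=random.Random(1))

tested = []
for i, test_idx in enumerate(folds):
    # Train on the other nine partitions, test on the held-out one
    train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
    assert not set(train_idx) & set(test_idx)
    tested.extend(test_idx)

print("fold sizes:", [len(f) for f in folds])
```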

[0313] Trees are usually constructed until they are a very good fit to the training data, then they
are "pruned" back by clipping off "noisy" branches and leaves. This improves the generalization
ability of the decision tree on unseen data and is essential to obtain good performance. The C4.5
package includes two methods for pruning trees: first, a standard tree pruning algorithm; second, a
rule extraction algorithm. In general, the tree based method was found to give superior results on
this data. Therefore, the rule-based method is not reported.

(2) Data Preparation 

[0314] Data on the response of various probes to normal tissue and five different cancers
(Adenocarcinoma, Large Cell Carcinoma, Mesothelioma, Small Cell Lung Cancer, and
Squamous Cell Carcinoma) was obtained as described elsewhere. The H-scores for probes 1-28,
and pathologists Pathologist 1 and Pathologist 2, were extracted from the database and put into a
flat data file. For the decision tree analysis each data point (even by two pathologists on the same
physical slide) was taken to be an independent observation of the effect of disease on staining.
This may slightly positively bias the performance of classification but should have no effect on
panel selection.

• The control categories of Emphysema, Granulomatous Disease, and Interstitial Lung
Disease were grouped together and called "Normal".

• For the detection panel all the cancers were grouped together and called "Abnormal",
making this a 2-class problem.

• For the single discrimination panel, the Normal cases were removed from the data to form
a 5-class problem.

• For the hold-out discrimination panels, each cancer was held out in turn and the
remaining cancers grouped into "Other" to give a set of five 2-class problems.
[0315] C4.5 requires a ".names" file which describes the data and the attributes to be included
in the analysis. An example names file for the discrimination panel is shown in Table 26:

Table 26:

    | C4.5 Names file for MonoGen ZF21 diag data

    Adenocarcinoma, Large Cell Carcinoma, Mesothelioma, Small Cell Lung Cancer,
    Squamous Cell Carcinoma.    | classes

    P1:   continuous.
    P2:   continuous.
    P3:   continuous.
    P4:   continuous.
    P5:   continuous.
    P7:   continuous.
    P8:   continuous.
    P9:   continuous.
    P10:  continuous.
    P11:  continuous.
    P12:  continuous.
    P13:  continuous.
    P14:  continuous.
    P15:  continuous.
    P16:  continuous.
    P17:  continuous.
    P18:  ignore.
    P19:  continuous.
    P20:  continuous.
    P21:  continuous.
    P22:  continuous.
    P23:  continuous.
    P24:  continuous.
    P25:  continuous.
    P26:  continuous.
    P27:  continuous.
    P28:  continuous.


[0316] Probe 18 was missing from the data and was set to "ignore" in all the designs. Setting
attributes to "ignore" in the names file is an easy and effective way of trimming probes from the
panels and is used in the data analysis.

(3) Data Analysis

[0317] Ten-fold cross validation was run on each data set using the "xval.sh" script supplied
with C4.5. Standard (default) parameters for the package were used. Cross validation is a
technique developed for classifier training and testing on small data sets. It involves randomly
splitting the data into N equal-sized partitions. The classifier is then trained on N-1 partitions
together and tested on the remaining partition. This is repeated N times, so that each partition
serves once as the test set.
[0318] Since the decision tree trained in one cross-validation (CV) trial may differ from the tree
obtained in another (different in both the probes selected and the tree coefficients), the number
of times each probe was selected by the tree in 10 trials was computed.

[0319] The first cull of probes was done by setting to ignore any probe which did not occur in a 
pruned tree 5 or more times out of the 10 CV trials. 

[0320] Then the cross-validation was repeated with this smaller set of candidate probes. The
second cull of probes was done by setting to ignore any probe which did not occur in a pruned
tree 5 or more times out of the 10 CV trials. If any further probes dropped out, a third CV run
was done.
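
The partitioning and culling steps described in paragraphs [0317] to [0320] can be sketched as follows. This is a minimal illustration in Python, not the "xval.sh" script itself; the function names, the fixed random seed, and the example trial data are assumptions made only for this sketch.

```python
import random
from collections import Counter


def kfold_partitions(n_items, n_folds=10, seed=0):
    """Randomly split item indices 0..n_items-1 into n_folds near-equal partitions."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def train_test_splits(n_items, n_folds=10, seed=0):
    """Yield (train, test) index lists; each partition is the test set exactly once."""
    folds = kfold_partitions(n_items, n_folds, seed)
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def cull_probes(probes_per_trial, min_count=5):
    """Keep probes that occur in the pruned tree in at least min_count trials."""
    counts = Counter(p for trial in probes_per_trial for p in set(trial))
    return sorted(p for p, c in counts.items() if c >= min_count)


# Hypothetical example: probes used by the pruned tree in each of 10 CV trials.
trials = [["P23", "P25", "P12"]] * 6 + [["P23", "P17"]] * 4
kept = cull_probes(trials)  # P17 appears in only 4 trials and is culled
```

Probes not returned by `cull_probes` would be set to "ignore" in the names file before the next cross-validation run.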

[0321] The panels selected by the various runs, and their estimated error performance, are
shown in the results tables. The panel performance for decision tree analysis is shown below, in
Table 27.
Table 27:

Panel Performance - Decision Trees

Detection Panel
Probes: 3, 7, 19, 25 and 23
                 Cancer     Control
Cancer           99.42%      0.58%
Control          17.82%     82.18%

Pair-wise Discrimination
Probes: 4, 6, 14, 19 and 23
                 Adeno      Others
Adeno            87.74%     32.26%
Others           11.20%     88.80%

Pair-wise Discrimination
Probes: 3, 6, 17, 19 and 25
                 Squamous   Others
Squamous         70.59%     29.41%
Others            4.07%     95.93%

Pair-wise Discrimination
Probes: 1, 5, 10, 13, 21, 27 and 28
                 Large Cell Others
Large Cell       36.36%     63.64%
Others            7.37%     92.63%

Pair-wise Discrimination
Probes: 3, 12 and 16
                 Mesothelioma  Others
Mesothelioma     82.05%        17.95%
Others            5.00%        95.00%

Pair-wise Discrimination
Probes: 12, 17, 20, 23 and 25
                 Small Cell Others
Small Cell       69.23%     30.77%
Others            1.49%     98.51%

Detection (without probe 7)
Probes: 6, 10, 16 and 19
                 Cancer     Control
Cancer           89.60%     10.40%
Control           3.30%     96.70%

Detection (only commercially preferred probes)
Probes: 5, 6, 10, 16, 19 and 23
                 Cancer     Control
Cancer           92.80%      7.20%
Control           5.49%     94.51%



[0322] An example decision tree structure is shown below, in Tables 28 and 29, for
discriminating between Small Cell Lung Cancer and the remaining four types of cancer.
Table 28:

C4.5 output format:

P23 <= 3 :
|   P25 <= 2 : Small Cell Lung Cancer (18.0)
|   P25 > 2 :
|   |   P17 <= 5 : Small Cell Lung Cancer (2.0)
|   |   P17 > 5 :
|   |   |   P20 <= 11 : Other (9.0)
|   |   |   P20 > 11 : Small Cell Lung Cancer (2.0)
P23 > 3 :
|   P12 > 7 : Other (120.0)
|   P12 <= 7 :
|   |   P20 <= 2 : Other (5.0)
|   |   P20 > 2 : Small Cell Lung Cancer (4.0)

Tree saved

Evaluation on training data (160 items):

     Before Pruning           After Pruning
    Size      Errors      Size      Errors      Estimate
     13     0 (0.0%)       13     0 (0.0%)       (5.2%)

Table 29:
Pictorial format:

[Tree diagram of the decision tree in Table 28. Root node: Probe 23 (branches <=3 and >3). Under
<=3: Probe 25, then Probe 17 (>5), then Probe 20 (<=11, >11). Under >3: Probe 12 (<=7), then
Probe 20 (<=2, >2). Leaves are labeled Small Cell Lung Cancer or Other.]
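
The tree in Table 28 can be read as a sequence of nested threshold tests on H-scores. The sketch below transcribes those tests into Python for illustration; the function name and the dictionary input format are assumptions, and the thresholds are those printed in the C4.5 output.

```python
def classify_small_cell(h):
    """Apply the Table 28 tree to a dict of H-scores keyed by probe name."""
    if h["P23"] <= 3:
        if h["P25"] <= 2:
            return "Small Cell Lung Cancer"
        if h["P17"] <= 5:
            return "Small Cell Lung Cancer"
        # P25 > 2 and P17 > 5: the decision rests on probe 20
        return "Other" if h["P20"] <= 11 else "Small Cell Lung Cancer"
    # P23 > 3
    if h["P12"] > 7:
        return "Other"
    return "Other" if h["P20"] <= 2 else "Small Cell Lung Cancer"
```

Only the five probes P12, P17, P20, P23 and P25 are consulted, which matches the probe list of the corresponding pair-wise panel.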







[0323] The panel performance for stepwise linear discriminant analysis is shown below, in Table 30:
Table 30:

Panel Performance - Stepwise LD

Detection Panel
Probes: 1, 4, 7, 15 and 16
                 Cancer     Control
Cancer           92.24%      7.76%
Control           1.16%     98.84%

Pair-wise Discrimination
Probes: 4, 5, 14, 19, 20, 25 and 27
                 Adeno      Others
Adeno            91.67%      8.33%
Others            5.43%     94.57%

Pair-wise Discrimination
Probes: 1, 2, 3, 24, 25 and 26
                 Squamous   Others
Squamous         88.00%     12.00%
Others            6.59%     93.41%

Pair-wise Discrimination
Probes: 1 and 7
                 Large Cell Others
Large Cell       80.95%     19.05%
Others           26.32%     73.68%

Pair-wise Discrimination
Probes: 3, 12 and 16
                 Mesothelioma  Others
Mesothelioma     96.67%         3.33%
Others            4.65%        95.35%

Pair-wise Discrimination
Probes: 12, 19, 22 and 23
                 Small Cell Others
Small Cell       93.75%      6.25%
Others            5.00%     95.00%

Detection (without probe 7)
Probes: 1, 2, 3, 4, 10, 11, 15, 16, 23, 24, 27 and 28
                 Cancer     Control
Cancer           85.34%     14.66%
Control           2.33%     97.67%

Detection (only commercially preferred probes)
Probes: 8, 10, 11, 19, 23 and 28
                 Cancer     Control
Cancer           81.20%     18.80%
Control           1.16%     98.84%

[0324] The panel performance for stepwise logistic regression analysis is shown below, in
Table 31:
Table 31:

Panel Performance - Stepwise LR

Detection Panel
Probes: 6, 7, 12, 23 and 25
                 Cancer     Control
Cancer           97.49%      2.63%
Control           2.51%     97.49%

Pair-wise Discrimination
Probes: 14, 19, 20, 25 and 27
                 Adeno      Others
Adeno            96.39%      3.61%
Others           12.29%     87.71%

Pair-wise Discrimination
Probes: 3 and 10
                 Squamous   Others
Squamous         94.93%      5.07%
Others           35.86%     64.14%

Pair-wise Discrimination
Probes: 1, 4, 6, 16 and 21
                 Large Cell Others
Large Cell       95.11%      4.89%
Others           61.00%     39.00%

Pair-wise Discrimination
Probes: 3, 7, 12 and 16
                 Mesothelioma  Others
Mesothelioma     95.07%         4.93%
Others           10.89%        89.11%

Pair-wise Discrimination
Probes: 12, 13 and 23
                 Small Cell Others
Small Cell       98.90%      1.10%
Others            4.00%     96.00%

Detection (without probe 7)
Probes: 1, 10, 19, 23 and 28
                 Cancer     Control
Cancer           94.00%      6.00%
Control           5.80%     94.20%

Detection (only commercially preferred probes)
Probes: 10, 19, 20, 23 and 28
                 Cancer     Control
Cancer           93.88%      6.12%
Control           6.39%     93.61%



iii. Neural networks and alternative methods

[0325] Artificial neural networks (ANNs) are candidate pattern recognition techniques which
could readily be applied to select features and design classifiers in association with this
invention. However, such techniques give little insight into the structure of the data and the
influence of particular probes in the way that an LDF gives. For this reason, this class of
algorithm was not used in this study. LDF stands for linear discriminant function, a linear
combination of features whose result is thresholded to determine the classification.
[0326] This class of techniques includes algorithms such as Multi-Layer Perceptron (MLP),
Back-Prop, Kohonen's Self-Organizing Maps, Learning Vector Quantization, K-nearest
neighbors and Genetic Algorithms.

iv. Special topics

(1) Assumptions 
• Linear discriminant analysis

  • Assumes the covariance matrices for the two classes are equal.

  • Minimizes the cost of misclassification only when the two classes are multivariate
    normal.

  • Assumes that the explanatory variables are continuous rather than categorical (in
    this study, the H-scores are categorical, while in practice (i.e., in an automated
    system) intensity can be measured on a continuous scale).

• Logistic regression (binomial generalized linear models)

[0327] See Venables and Ripley, chapter 7 ("Modern Applied Statistics with S-PLUS" (W.N.
Venables and B.D. Ripley, Springer-Verlag, New York, 1999)).

(2) Marker rejection (de-selection) 

[0328] Computerized implementations of discriminant analysis and regression procedures
include stepwise variable selection procedures, e.g., stepAIC in R. These procedures are designed
to select the best subset of variables for use as explanatory variables. In reality, because of the
step-by-step nature of these procedures, there is no guarantee that the best variables are selected
for prediction (Johnson and Wichern, p. 299). Nevertheless, such procedures do provide the basis
for marker selection and de-selection.

(3) Pairwise tests 

[0329] Inherent problems in designing multiclass classifiers are discussed in "Applied
Multivariate Statistical Analysis", R. A. Johnson and D. W. Wichern, 2nd Ed., 1988, Prentice-Hall,
N.J. This is the motivation for developing several separate two-class classifiers (discrimination
panels).

(4) Redundancy consideration in panel composition 

[0330] "Linear models form the core of classical statistics and are still the basis of much of
statistical practice" ("Modern Applied Statistics with S-PLUS", W.N. Venables and B.D.
Ripley, Springer-Verlag, New York, 1999). Linear models are the foundation for the t-test,
analysis of variance (ANOVA), regression analysis, as well as a variety of multivariate methods
including discriminant analysis. Explanatory variables may or may not enter the model as
first-order terms. This is true also of (non-linear) logistic regression. The logistic regression
model is simply a non-linear transformation of the linear regression model: the dependent
variable is replaced by a log odds ratio (logit). In summary, these statistical methods are based
on linear relationships between the explanatory variables. Consequently, one avenue for seeking
redundancy in panels is to identify highly correlated variables (markers). It may be possible to
replace one marker with the other in a panel to achieve similar performance.
[0331] Another avenue for seeking redundancy in panels is to undertake a "best subsets"
regression analysis. Given a starting model with all of the explanatory variables of interest, the
aim is to find the best single-variable regression models, the best two-variable regression, etc.
This methodology is implemented in the SAS statistical package.
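
As an illustration of the first avenue (identifying highly correlated markers), the following Python sketch flags marker pairs whose Pearson correlation exceeds a threshold. The 0.9 threshold, the function names, and the example H-scores are assumptions chosen for the sketch and do not come from the study data; the sketch also assumes no marker has constant scores.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(scores, threshold=0.9):
    """Return (marker_a, marker_b, r) for pairs with |r| >= threshold."""
    names = sorted(scores)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(scores[a], scores[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical H-scores (range 0 to 12) for three markers over six cases.
scores = {
    "P3": [0, 2, 4, 6, 8, 12],
    "P16": [1, 3, 5, 7, 9, 12],
    "P19": [12, 0, 6, 2, 9, 1],
}
pairs = redundant_pairs(scores)
```

A flagged pair is only a candidate for substitution; panel performance would still need to be re-estimated after swapping one marker for the other.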

(5) Use of weighting scores 

(a) Commercial and clinical considerations 

[0332] For many reasons, including strategic and commercial factors, cost, availability, and ease
of use, it may be preferred to encourage the selection of certain probes in a panel and penalize
the selection of others, at the same time trading this off against panel size or performance.

(b) Attribute costing 

[0333] Methods for such attribute weighting (in decision trees) have been proposed in the
machine learning literature in other contexts such as the incorporation of background knowledge
(M. Nunez, "The Use of Background Knowledge", Machine Learning 6: 231-250, 1991), and
the differential cost of obtaining information from robotic sensors (M. Tan, "Cost-sensitive
Learning of Classification Knowledge and its Applications in Robotics", Machine Learning 13:
7-33, 1993).

[0334] Both of these cost-sensitive algorithms have been implemented in the literature by
minor changes to the standard machine learning software package known as C4.5 (J. Ross
Quinlan, "C4.5: programs for machine learning", Morgan Kaufmann, CA, 1993). For
convenience, this approach was followed to implement the "EG2" algorithm of Nunez.
[0335] In the C4.5 decision tree construction phase, the algorithm compares each available
attribute to split on and chooses the single one which maximizes the information gain, G_i. In the
EG2 algorithm, (2^{G_i} - 1)/(C_i + 1) is maximized, which incorporates the cost of information for
attribute i, C_i. The vector of weights needs to be set a priori by the user.
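
The effect of the EG2 criterion on attribute choice can be illustrated with a short Python sketch. The function names and the example gains and costs are hypothetical; only the formula (2^G - 1)/(C + 1) is taken from the text.

```python
def eg2_score(gain, cost):
    """EG2 selection criterion: (2**gain - 1) / (cost + 1)."""
    return (2.0 ** gain - 1.0) / (cost + 1.0)


def best_attribute(gains, costs):
    """Pick the attribute with the highest EG2 score; missing costs default to 0."""
    return max(gains, key=lambda a: eg2_score(gains[a], costs.get(a, 0.0)))


# Hypothetical information gains and user-supplied costs for two probes.
gains = {"P7": 0.60, "P19": 0.45}
costs = {"P7": 2.0, "P19": 0.0}

chosen_with_costs = best_attribute(gains, costs)   # cost penalty favors P19
chosen_without_costs = best_attribute(gains, {})   # plain gain ranking favors P7
```

With all costs at zero the criterion is a monotonic function of the gain, so the ranking of attributes is the same as under information gain alone.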

(i) Code Modifications 

[0336] The C4.5 source code was modified to implement the economic generalizer "EG2"
algorithm proposed by M. Nunez ("The Use of Background Knowledge", Machine Learning 6:
231-250, 1991).

[0337] The exact modifications to the C4.5 package are as follows.

After the following lines in file "R8/Src/contin.c" (J. Ross Quinlan, "C4.5: programs for
machine learning", Morgan Kaufmann, CA, 1993):

    ForEach(i, Xp, Lp - 1)
    {
        if ( (Val = SplitGain[i] - ThreshCost) > BestVal )
        {
            BestI = i;
            BestVal = Val;
        }
    }

the new line:

    BestVal = (powf(2.0, BestVal) - 1.0) / (AttributeCosts[Att] + 1.0);

is inserted, where the vector of attribute costs has been previously read in from a text file
maintained by the user.

(ii) Experimental Methodology
[0338] The commercially preferred probes are: 2, 4, 5, 6, 8, 10, 11, 12, 16, 19, 20, 22, 23, 28.
[0339] For the sake of example, suppose the above probes are commercially preferred due to
cost and it is desired to reselect the detection panel taking this cost into account.

[0340] The modified C4.5 decision tree software was used to give the commercially preferred
probes a penalty of zero and non-commercially preferred probes a penalty of two. The 10-fold
cross validated panel selection methodology (as described elsewhere) was run using the modified
C4.5 algorithm.

(iii) Results

[0341] The standard decision tree detection panel consists of probes 3, 7, 19, 25, 28. The resulting
panel members are 2, 6, 7, 10, 19, 25, 28, which used only 2 non-commercially preferred probes,
P7 and P25. Note these probes have been selected by the method in spite of their increased cost
due to their superior performance on this data. The panel is now larger: 7 probes versus 5
originally. There is no demonstrable drop in panel performance on this data, although the
performance will now be sub-optimal as a trade-off against the reduced cost of probes.

(iv) Conclusion 

[0342] A straightforward way has been established for incorporating costs of using probes into 
the panel selection methodology. 

(c) Misclassification costing 

(i) Background 

[0343] For many reasons it may be desired to select an optimal panel bearing in mind that the
costs of the different kinds of classification errors may vary. For example, it may be desired to
select a panel which has an increased sensitivity to one disease (say Large Cell Carcinoma) and
be willing to trade this off against reduced specificity and sensitivity elsewhere in the confusion
matrix.

[0344] In theory a matrix of misclassification costs (of the same dimensions as the confusion
matrix) to incorporate all the possible combinations of costs may be needed. In practice, only
those costs which are non-unity (the default) are entered.

[0345] The commercial decision tree software See5 (RuleQuest Research Pty Ltd, 30 Athena
Avenue, St Ives NSW 2075, Australia (http://www.rulequest.com)) incorporates this
capability and was used in the following demonstration.

(ii) Aim

[0346] The standard joint discrimination panel (described elsewhere) consists of the members
P2, 3, 4, 5, 12, 14, 16, 19, 22, 23, 28, and gives the following estimated confusion matrix:

 (a)   (b)   (c)   (d)   (e)    <- classified as
  24     4     2     5     2    (a): class Adenocarcinoma
   8     7     3     5     4    (b): class Large Cell Carcinoma
   1     1    33     1     4    (c): class Mesothelioma
   6     2     1    23          (d): class Small Cell Lung Cancer
   4     4     3     2    24    (e): class Squamous Cell Carcinoma



[0347] The sensitivity for Large Cell Carcinoma is low at 26 percent. If one wished to increase
this sensitivity in a newly designed panel, the following method may be employed.

(iii) Methodology
[0348] The following costs file was generated:

| costs file for ZF21Discrim
|
| Increase sensitivity for "Large Cell Carcinoma"
|
Mesothelioma, Large Cell Carcinoma: 10
Adenocarcinoma, Large Cell Carcinoma: 10
Mesothelioma, Large Cell Carcinoma: 10
Small Cell Lung Cancer, Large Cell Carcinoma: 10
Squamous Cell Carcinoma, Large Cell Carcinoma: 10

[0349] This file up-weights the misclassification of Large Cell Carcinoma as any of the other
cancers by a factor of 10. This will tend to increase the sensitivity of detection in this class (with
reduced performance elsewhere), but no weighting can ensure perfect classification.
[0350] The standard decision tree panel selection methodology was applied (using See5 instead
of C4.5).

(iv) Results 

[0351] The new panel members are P2, 3, 4, 5, 6, 9, 12, 14, 16, 17, 25, 28, with an estimated
performance of:

 (a)   (b)   (c)   (d)   (e)    <- classified as
  20    13     1     1     2    (a): class Adenocarcinoma
   3    13     3     2     6    (b): class Large Cell Carcinoma
   1     9    27     2     1    (c): class Mesothelioma
   2     9          21          (d): class Small Cell Lung Cancer
   1    15     2     1    18    (e): class Squamous Cell Carcinoma

[0352] The above demonstrates that the estimated sensitivity for Large Cell Carcinoma has now
increased to 48%.
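
The 26% and 48% figures can be checked directly from the Large Cell Carcinoma rows of the two confusion matrices. The Python sketch below is illustrative only; the function and variable names are assumptions, while the counts are those given in paragraphs [0346] and [0351].

```python
def sensitivity(row, true_index):
    """Fraction of a class's cases (one confusion-matrix row) classified correctly."""
    return row[true_index] / sum(row)


LARGE_CELL = 1  # column (b) in the confusion matrices

# Row (b), class Large Cell Carcinoma, before and after misclassification costing.
before = [8, 7, 3, 5, 4]
after = [3, 13, 3, 2, 6]

sens_before = sensitivity(before, LARGE_CELL)  # 7 of 27 cases
sens_after = sensitivity(after, LARGE_CELL)    # 13 of 27 cases
```

The gain comes at a price visible in column (b) of the second matrix: many more cases of the other four cancers are now classified as Large Cell Carcinoma.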

(v) Conclusion 

[0353] A straightforward way has been demonstrated for incorporating the differential costs of
misclassification into the panel selection methodology.

d. Performance metrics
[0354] Outputs provided by the analysis indicating the estimated performance of each method
include:

i. ROC analyses

[0355] Receiver Operating Characteristic (ROC) curves show the estimated percentage (or
per-unit probability) of false positive and false negative scores for different threshold levels in
the classifier. An indifferent classifier, unable to discriminate better than random choice, would
present a ROC curve with equal true and false readings. The area under this curve would be 50%
(0.5 probability).

[0356] Area Under the Curve (AUC) is often used as an overall estimate of classifier
performance and most commercial discriminant function packages compute this figure. A perfect
classifier would have 100% Area Under the Curve; a useless classifier would have an AUC near
50% (0.5 probability).
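
One common way to compute the AUC without drawing the curve is the rank-based (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The Python sketch below uses that formulation; it is not the method of any particular package named in this document, and the example scores are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), counting ties as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs for cancer (positive) and control (negative) cases.
perfect = roc_auc([0.9, 0.8, 0.7], [0.3, 0.2, 0.1])
useless = roc_auc([0.5, 0.5], [0.5, 0.5])
mixed = roc_auc([0.9, 0.4], [0.6, 0.1])
```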

ii. Confusion matrices: counts and percentages

[0357] Confusion matrices show how data from the test set was classified. For pair-wise tests
these are counts of true positive, false positive, true negative or false negative scores. These may
be shown as actual counts or as percentages. For the multi-way panel, which attempts to give a
unique diagnosis with one panel only, the confusion matrix would show counts for each correct
classification. For instance, each time Small Cell Carcinoma is detected as such it would be
entered in one diagonal element of the matrix. Incorrect scores, for instance how often a small
cell carcinoma is incorrectly identified as squamous cell cancer, would be entered in the
appropriate off-diagonal element of the matrix. Error rates are used to summarize data in the
confusion matrix as the sum of all false classifications divided by the total number of
classifications made, expressed as a percentage.
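
As a worked example of the error rate, the sketch below applies the definition to the five-class confusion matrix given earlier in paragraph [0346], with the blank cell read as zero. The function name is an assumption made for illustration.

```python
def error_rate(matrix):
    """Off-diagonal count divided by total count for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total


# Rows are true classes (a) to (e); columns are the assigned classes.
joint_panel = [
    [24, 4, 2, 5, 2],
    [8, 7, 3, 5, 4],
    [1, 1, 33, 1, 4],
    [6, 2, 1, 23, 0],
    [4, 4, 3, 2, 24],
]

rate = error_rate(joint_panel)  # 62 misclassified out of 173 cases
```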

iii. Sensitivity and specificity 

[0358] Specificity refers to the extent to which any definition excludes invalid cases. If a
definition has poor specificity, it is high in false positives. This means that it labels individuals
as having a disorder when there is really no disorder present. Sensitivity refers to the extent to
which any definition includes all valid cases. If a definition has poor sensitivity, it is high in
false negatives (individuals who have a disorder present are falsely being diagnosed as not
having one).

3. Data analysis and results

a. Sample size and variability

• Of the 354 cases in the combined Pathologist 1 and Pathologist 2 data set, only 202 cases
possessed an H-score for every marker (variable or feature).

• The small number of complete observations and the large number of variables lead to
estimation problems (curse of dimensionality). Hence it is necessary to prune back severely
the number of variables used to build a classifier.

• Due to the small number of observations it is not prudent to divide the data into separate
training and testing sets (necessary for the robust estimation of classifier performance).
For this reason, it was necessary to use resampling methods (such as cross-validation and
multiple random trials).

• The design of a multiclass classifier for cancer discrimination is difficult because there
are so few observations for each type of cancer.

b. De-selected markers 

[0359] Markers were de-selected using the methodology described above. Markers that were
de-selected are represented by non-selection in the panels.

c. Detection panel(s) composition 

i. Selected marker probes 

[0360] The selected marker probes for all three methods are summarized in Figure 5.

ii. Minimum selected marker set

[0361] For the detection panel it is clear that probe 7 delivered the best detection performance
for a single marker. Combinations of probes were analyzed to see if a reliable panel could be
obtained with more probes.

(1) Method

[0362] The Logistic Regression method allows best subsets to be ranked in terms of a
performance measure (Fisher score). This analysis was used to select the combinations from 1
through 5 probes. Fisher's linear discriminant function and logit models (logistic regression) were
used to illustrate the performance of these combinations. Data are shown above.

(2) Conclusions 

[0363] Probe 7 performs well on its own as a classifier; however, a drawback to using probe 7
alone is that probe 7 has a high false negative score. The best performance using Fisher's linear
discriminant function as a classifier was with probes 7 and 16. The variability of results amongst
panels using other combinations suggests the noise added by more features is outweighing any
potential to improve classification scores. The small number of incorrectly scored samples gives
a poor representation of the statistics of these rarer events. A larger number of cases may allow
a better classifier to be designed. Techniques to select best combinations of probes using
different classifiers may produce a different best panel, depending on the structure of the data.

iii. Supplemental markers 

[0364] It is shown that panels can be designed to suit the availability of different probes.
Different methodologies can be used for selecting these subsets: Decision Trees, Logistic
Regression, and Linear Discriminant Functions. Data are shown above.
[0365] Using SPSS, a Fisher's linear discriminant function was applied to the scores obtained
from the panel in which constraints were applied due to access limitations; for example, all of
the probes come from one vendor. Again, the stepwise option was selected to find the best
combination of features. Performance was estimated using the Leave-One-Out cross validation
test.

iv. Alternative markers: biological mechanisms of action (functionally
equivalent markers)

[0366] A person of ordinary skill in the art is able to determine functionally equivalent
markers. The functional behaviors of the markers used in the panel are described throughout this
document.

v. Marker localization
[0367] The localizations of the various markers used in this study are described elsewhere in
this document.

vi. Panel Performance

[0368] The performance of the three methods is shown above.

vii. Limitations on interpretation of panel performance 

• Due to the small data set and the need to employ resampling methods, there is the danger that
the classifiers have been over-trained (made to fit the data too closely).
• The panel performance using cytology specimens is difficult to forecast accurately since
it is not clear whether sputum cytology samples will contain adequate numbers of cells
that are representative of the cells analyzed in the histological validation studies.
Nevertheless, given an adequate cellular sample size, one would expect the optimized
panel to behave similarly with cytological specimens.

d. Discriminant Panel Composition

i. A single 5-way panel for all cancers 

[0369] Of the three analysis techniques, only a decision tree is amenable to a single 5-way
panel. A single decision tree was therefore constructed to simultaneously classify all types of
lung cancer. The panel members are shown in Figure 5. The panel performance is shown above in
the panel performance tables.

ii. Panels for discriminating a single type of lung cancer against all
others

[0370] Linear discriminant functions are not well suited to performing simultaneous multi-class
discrimination. The performance of five separate classifiers, each designed separately to
discriminate one of the cancers from a pooled set of all the cancers, was analyzed. Such
combinations have the potential to classify none of the cases as having one of the candidate
cancers, or classify a single case as having two or more of the candidate cancers. This has a
potential advantage in identifying inconsistent cases for further review.
[0371] It has been seen that a single discriminant panel for all cancer types (a five-way
classifier) has a fairly high overall error rate. In the panel performance data shown
above, the performance of five pair-wise classifiers, each designed to identify one cancer from
the four other possible cancers, is shown. This approach is amenable to analysis using Decision
Trees and Linear Discriminant functions. The technique has the potential to deliver an
ambiguous finding when applied, giving two or more diagnoses for a single patient, suggesting
further clinical investigation. The technique also has the potential to deliver no finding, again
suggesting further investigation (perhaps a re-test with the detection panel).
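
The three possible outcomes of running five pair-wise classifiers on one case (a unique diagnosis, an ambiguous finding, or no finding) can be sketched in Python as follows. The function name, the outcome labels, and the example inputs are assumptions made for illustration.

```python
def combine_pairwise(calls):
    """Combine one-vs-others results; calls maps cancer name -> True if flagged."""
    positives = sorted(name for name, flagged in calls.items() if flagged)
    if not positives:
        return ("no finding", positives)   # suggests re-test or further review
    if len(positives) == 1:
        return ("unique", positives)
    return ("ambiguous", positives)        # two or more diagnoses for one case


# Hypothetical outputs of the five pair-wise panels for a single case.
case = {
    "Adenocarcinoma": False,
    "Large Cell Carcinoma": True,
    "Mesothelioma": False,
    "Small Cell Lung Cancer": False,
    "Squamous Cell Carcinoma": True,
}
outcome = combine_pairwise(case)
```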

iii. Panels to account for possibility of false positive cases from 
detection panels 

[0372] A further panel can be trained to discriminate among the false positive cases (from the
detection panel) and the five cancer types. This involves selecting those individual cases from
the detection panel that were incorrectly classified as abnormal. This trains a dedicated classifier
on the 'harder' problem of detecting these 'special' cases. However, while this is a theoretically
sound task, the data set only yielded four of these cases and the population was deemed to be
under-represented for analysis.

iv. Selected Markers
[0373] The selected marker probes for all three methods are summarized in Figure 5.

v. Minimum selected marker set

[0374] This topic is addressed below under "Robustness of Approach Demonstrated by Similar
Results Using Different Methods."

vi. Supplemental markers

[0375] This topic is addressed below under "Robustness of Approach Demonstrated by Similar
Results Using Different Methods."

vii. Alternative markers: biological mechanisms of action
[0376] A person of ordinary skill in the art is able to determine functionally equivalent
markers. The functional behaviors of the markers used in the panel are described throughout this
document.

viii. Marker localization
[0377] The localizations of the various markers used in this study are described throughout this
document.

ix. Panel Performance
[0378] The performance of the three methods is summarized in Figure 5.

e. Effect of weighting parameters 
[0379] In addition to user-supplied weighting criteria for markers and also for disease states
(classes) as discussed earlier, one can also use a binary weighting scheme. For example, if all
non-DAKO-supplied probes are weighted "0" and all DAKO-supplied probes are weighted "1",
then the optimized panel will contain only DAKO-supplied probes. This is an important product
design capability for any vendor who intends to develop and market molecular diagnostic panel
kits using only their supplies.

f. Effect of using other (non H-score) objective scoring parameters
i. Background

[0380] The Pathology Review sheet contains a set of boxes as follows, in Table 32:
Table 32:

Intensity      None      Weak      Moderate      Intense
0-5%           □ 0       □ 0       □ 0           □ 0
6-25%          □ 1       □ 1       □ 1           □ 1
26-50%         □ 2       □ 2       □ 2           □ 2
51-75%         □ 3       □ 3       □ 3           □ 3
>75%           □ 4       □ 4       □ 4           □ 4



[0381] The standard scoring system uses the "H score", which is obtained by grading the
intensity as: none = 0, weak = 1, moderate = 2, intense = 3, and the percentage of cells as:
0-5% = 0, 6-25% = 1, 26-50% = 2, 51-75% = 3, >75% = 4, and then multiplying the two grades
together. For example, 50% weakly stained plus 50% moderate stained would score 10 = 2x2 + 2x3.
ii. Method

[0382] An alternative scoring method was analyzed in which the response was divided into
low, medium and high as follows:

(a) if more than 50% of cells had moderate or above stain → HIGH

(b) if more than 50% of cells had no stain → LOW

(c) otherwise → MEDIUM

[0383] The decision tree detection panel selection methodology was repeated using this 3-level
factor instead of H-score. This caused the tree to split into 3 branches at each node, if required.
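
The three-level rule in paragraph [0382] can be sketched in Python as below. The function name and the representation of the input as four percentages (one per intensity level) are assumptions made for illustration; the review sheet itself records percentage bands rather than exact percentages.

```python
def three_level_score(pct_none, pct_weak, pct_moderate, pct_intense):
    """Map the percentage of cells at each staining intensity to LOW/MEDIUM/HIGH."""
    if pct_moderate + pct_intense > 50:
        return "HIGH"    # rule (a): more than 50% of cells moderate or above
    if pct_none > 50:
        return "LOW"     # rule (b): more than 50% of cells unstained
    return "MEDIUM"      # rule (c): everything else


example_high = three_level_score(10, 20, 40, 30)
example_low = three_level_score(70, 20, 10, 0)
example_medium = three_level_score(40, 30, 20, 10)
```

Because the four percentages sum to at most 100, rules (a) and (b) cannot both apply to the same case, so the order of the two tests does not affect the result.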
iii. Results

[0384] The panel selected was: Probes 3, 7, 10, 11, 16, 19, 20, 28, with an estimated
performance of:

Classified as ->    (a)     (b)
Control (a)          79      22     Specificity = 78%
Cancer (b)           24     149     Sensitivity = 86%

This should be compared to the reference performance with H-scores of:

Classified as ->    (a)     (b)
Control (a)          85       6     Specificity = 93%
Cancer (b)            5     120     Sensitivity = 96%
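
The specificity and sensitivity figures in the two tables follow from the counts by the usual definitions. The Python sketch below recomputes them; the function names are assumptions, and the counts are read from the tables above.

```python
def specificity(true_neg, false_pos):
    """True negatives / (true negatives + false positives)."""
    return true_neg / (true_neg + false_pos)


def sensitivity(true_pos, false_neg):
    """True positives / (true positives + false negatives)."""
    return true_pos / (true_pos + false_neg)


# Alternative 3-level scoring: Control row (79, 22), Cancer row (24, 149).
alt_spec = specificity(79, 22)
alt_sens = sensitivity(149, 24)

# Reference H-score panel: Control row (85, 6), Cancer row (5, 120).
ref_spec = specificity(85, 6)
ref_sens = sensitivity(120, 5)
```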











iv. Conclusions 

• There is a substantial loss of performance (larger panels, lower sensitivity and lower
specificity) when the proposed alternative scoring system is used.

• Treating the H-score as a continuous variable (in the range 0 to 12) seems to be near
optimal for panel selection on the data examined.

• The many other possible scoring systems have not been examined, but may be feasible
and applicable to the experimentally tested panel design and development methodology.

4. Lung Cancer Detection and Discrimination Panels 
[0385] Listed below are exemplary lung cancer detection and discrimination panels determined
by the above illustrative example. It is noted that although the panels listed below recite specific
probes, each specific probe may be substituted by a correlate probe or a functionally related
probe.

Detection (No Constraints) 

• anti-Cyclin A combined with one or more additional probes.
• anti-Cyclin A, anti-human epithelial related antigen (MOC-31).
• anti-Cyclin A, anti-ER-related P29.
• anti-Cyclin A, anti-mature surfactant apoprotein B.
• anti-Cyclin A, anti-human epithelial related antigen (MOC-31), anti-VEGF.
• anti-Cyclin A, anti-human epithelial related antigen (MOC-31), anti-mature surfactant
apoprotein B.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-VEGF.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-surfactant apoprotein A.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-VEGF, anti-surfactant apoprotein A.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-VEGF, anti-Cyclin D1.
• anti-Cyclin A, anti-human epithelial related antigen (MOC-31) combined with one or
more additional probes.
• anti-Cyclin A, anti-ER-related P29 combined with one or more additional probes.
• anti-Cyclin A, anti-mature surfactant apoprotein B combined with one or more additional
probes.
• anti-Cyclin A, anti-human epithelial related antigen (MOC-31), anti-VEGF combined
with one or more additional probes.
• anti-Cyclin A, anti-human epithelial related antigen (MOC-31), anti-mature surfactant
apoprotein B combined with one or more additional probes.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-VEGF combined with one or more additional probes.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-surfactant apoprotein A combined with one or more additional probes.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-VEGF, anti-surfactant apoprotein A combined with one or more
additional probes.
• anti-Cyclin A, anti-mature surfactant apoprotein B, anti-human epithelial related antigen
(MOC-31), anti-VEGF, anti-Cyclin D1 combined with one or more additional probes.

Detection (W/O anti-Cyclin A)

• anti-Ki-67 combined with one or more additional probes.
• anti-Ki-67 combined with any one probe selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B.
• anti-Ki-67 combined with any two probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B.
• anti-Ki-67 combined with any three probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B.
• anti-Ki-67 combined with any four probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B.
• anti-Ki-67 combined with any five probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B.
• anti-Ki-67, anti-VEGF, anti-human epithelial related antigen (MOC-31), anti-TTF-1,
anti-EGFR, anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B.
• anti-Ki-67 combined with any one probe selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B, and with one or
more additional probes.
• anti-Ki-67 combined with any two probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B, and with one or
more additional probes.
• anti-Ki-67 combined with any three probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B, and with one or
more additional probes.
• anti-Ki-67 combined with any four probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B, and with one or
more additional probes.
• anti-Ki-67 combined with any five probes selected from the group consisting of anti-VEGF,
anti-human epithelial related antigen (MOC-31), anti-TTF-1, anti-EGFR,
anti-proliferating cell nuclear antigen and anti-mature surfactant apoprotein B, and with one or
more additional probes.
• anti-Ki-67, anti-VEGF, anti-human epithelial related antigen (MOC-31), anti-TTF-1,
anti-EGFR, anti-proliferating cell nuclear antigen, anti-mature surfactant apoprotein B and one
or more additional probes.
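
The "any one", "any two", ... "any five" entries above each stand for a family of concrete panels. A minimal sketch of how such a family could be enumerated is given below; the function name is hypothetical, and the probe names are copied from the list above.

```python
from itertools import combinations

# The six probes that may be combined with anti-Ki-67 in the list above.
OPTIONAL_PROBES = [
    "anti-VEGF",
    "anti-human epithelial related antigen (MOC-31)",
    "anti-TTF-1",
    "anti-EGFR",
    "anti-proliferating cell nuclear antigen",
    "anti-mature surfactant apoprotein B",
]


def ki67_panels(k):
    """Return every panel made of anti-Ki-67 plus k of the optional probes."""
    return [("anti-Ki-67",) + combo for combo in combinations(OPTIONAL_PROBES, k)]


for k in range(1, 7):
    print(k, len(ki67_panels(k)))
```

Each "any k probes" entry therefore covers C(6, k) distinct panels, before any "one or more additional probes" are added.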

Detection With Commercially Preferred Probes

• anti-Ki-67 combined with one or more additional probes.
• anti-TTF-1 combined with one or more additional probes.
• anti-EGFR combined with one or more additional probes.
• anti-proliferating cell nuclear antigen combined with one or more additional probes.
• two probes selected from the group consisting of anti-Ki-67, anti-TTF-1, anti-EGFR and
anti-proliferating cell nuclear antigen.
• three probes selected from the group consisting of anti-Ki-67, anti-TTF-1, anti-EGFR and
anti-proliferating cell nuclear antigen.
• anti-Ki-67, anti-TTF-1, anti-EGFR and anti-proliferating cell nuclear antigen.
• two probes selected from the group consisting of anti-Ki-67, anti-TTF-1, anti-EGFR and
anti-proliferating cell nuclear antigen, and one or more additional probes.
• three probes selected from the group consisting of anti-Ki-67, anti-TTF-1, anti-EGFR and
anti-proliferating cell nuclear antigen, and one or more additional probes.
• anti-Ki-67, anti-TTF-1, anti-EGFR, anti-proliferating cell nuclear antigen, and one or
more additional probes.

Discrimination Between Adenocarcinoma And Other Lung Cancers

• anti-mucin 1 and anti-TTF-1.
• anti-mucin 1 and anti-TTF-1 combined with any one probe selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3.
• anti-mucin 1 and anti-TTF-1 combined with any two probes selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3.
• anti-mucin 1 and anti-TTF-1 combined with any three probes selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3.
• anti-mucin 1 and anti-TTF-1 combined with any four probes selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3.
• anti-VEGF, anti-surfactant apoprotein A, anti-mucin 1, anti-TTF-1, anti-BCL2,
anti-ER-related P29 and anti-Glut 3.
• anti-mucin 1, anti-TTF-1 and one or more additional probes.
• anti-mucin 1 and anti-TTF-1 combined with any one probe selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3, and with one or more additional probes.
• anti-mucin 1 and anti-TTF-1 combined with any two probes selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3, and with one or more additional probes.
• anti-mucin 1 and anti-TTF-1 combined with any three probes selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3, and with one or more additional probes.
• anti-mucin 1 and anti-TTF-1 combined with any four probes selected from the group
consisting of anti-VEGF, anti-surfactant apoprotein A, anti-BCL2, anti-ER-related P29
and anti-Glut 3, and with one or more additional probes.
• anti-VEGF, anti-surfactant apoprotein A, anti-mucin 1, anti-TTF-1, anti-BCL2,
anti-ER-related P29, anti-Glut 3 and one or more additional probes.

Discrimination Between Squamous Cell Carcinoma And Other Lung Cancers

• anti-CD44v6 combined with one or more additional probes.
• anti-CD44v6 combined with any one probe selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3.
• anti-CD44v6 combined with any two probes selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3.
• anti-CD44v6 combined with any three probes selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3.
• anti-CD44v6 combined with any four probes selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3.
• anti-CD44v6, anti-VEGF, anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3.
• anti-CD44v6 combined with any one probe selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3, and with one or more additional probes.
• anti-CD44v6 combined with any two probes selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3, and with one or more additional probes.
• anti-CD44v6 combined with any three probes selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3, and with one or more additional probes.
• anti-CD44v6 combined with any four probes selected from the group consisting of anti-VEGF,
anti-thrombomodulin, anti-Glut 1, anti-ER-related P29 and
anti-melanoma-associated antigen 3, and with one or more additional probes.
• anti-CD44v6, anti-VEGF, anti-thrombomodulin, anti-Glut 1, anti-ER-related P29,
anti-melanoma-associated antigen 3 and one or more additional probes.

Discrimination Between Large Cell Carcinoma And Other Lung Cancers

• anti-VEGF combined with one or more additional probes.
• anti-VEGF and anti-p120.
• anti-VEGF and anti-Glut 3.
• anti-VEGF, anti-p120 and anti-Cyclin A.
• anti-VEGF, anti-p120 and one or more additional probes.
• anti-VEGF, anti-Glut 3 and one or more additional probes.
• anti-VEGF, anti-p120, anti-Cyclin A and one or more additional probes.

Discrimination Between Mesothelioma And Other Lung Cancers

• anti-CD44v6 combined with one or more additional probes.
• anti-proliferating cell nuclear antigen combined with one or more additional probes.
• anti-human epithelial related antigen (MOC-31) combined with one or more additional
probes.
• two probes selected from the group consisting of anti-CD44v6, anti-proliferating cell
nuclear antigen and anti-human epithelial related antigen (MOC-31), combined with one
or more additional probes.
• anti-CD44v6, anti-proliferating cell nuclear antigen, anti-human epithelial related antigen
(MOC-31) and one or more additional probes.

Discrimination Between Small Cell And Other Lung Cancers

• anti-proliferating cell nuclear antigen combined with one or more additional probes.
• anti-BCL2 combined with one or more additional probes.
• anti-EGFR combined with one or more additional probes.
• two probes selected from the group consisting of anti-proliferating cell nuclear antigen,
anti-BCL2 and anti-EGFR.
• anti-proliferating cell nuclear antigen, anti-BCL2, anti-EGFR.
• two probes selected from the group consisting of anti-proliferating cell nuclear antigen,
anti-BCL2 and anti-EGFR, combined with one or more additional probes.
• anti-proliferating cell nuclear antigen, anti-BCL2, anti-EGFR and one or more additional
probes.

Simultaneous Discrimination Of Adenocarcinoma, Squamous Cell Carcinoma, Large Cell
Carcinoma, Mesothelioma And Small Cell Carcinoma

• two or more probes selected from anti-VEGF, anti-thrombomodulin, anti-CD44v6,
anti-surfactant apoprotein A, anti-proliferating cell nuclear antigen, anti-mucin 1, anti-human
epithelial related antigen (MOC-31), anti-TTF-1, anti-N-cadherin, anti-EGFR and
anti-proliferating cell nuclear antigen.
• anti-VEGF, anti-thrombomodulin, anti-CD44v6, anti-surfactant apoprotein A,
anti-proliferating cell nuclear antigen, anti-mucin 1, anti-human epithelial related antigen
(MOC-31), anti-TTF-1, anti-N-cadherin, anti-EGFR and anti-proliferating cell nuclear
antigen.
• two or more probes selected from anti-VEGF, anti-thrombomodulin, anti-CD44v6,
anti-surfactant apoprotein A, anti-proliferating cell nuclear antigen, anti-mucin 1, anti-human
epithelial related antigen (MOC-31), anti-TTF-1, anti-N-cadherin, anti-EGFR and
anti-proliferating cell nuclear antigen, combined with one or more additional probes.
• anti-VEGF, anti-thrombomodulin, anti-CD44v6, anti-surfactant apoprotein A,
anti-proliferating cell nuclear antigen, anti-mucin 1, anti-human epithelial related antigen
(MOC-31), anti-TTF-1, anti-N-cadherin, anti-EGFR and anti-proliferating cell nuclear
antigen, combined with one or more additional probes.

5. Conclusions 

a. Validity of panel approach to molecular diagnostics 

i. Non-intuitive solutions 

[0386] Histograms were plotted (PathologistData.xls, worksheet: Histograms) showing the
distribution of marker scores for each probe for Control vs. Cancer. It is clear from these
histograms that an intuitive selection of probes for specific panels is not obvious, and that the
invention described allows effective combinations to be found in the absence of an obvious
method.

ii. Optimization for varied product applications

iii. Robustness of approach demonstrated by similar results using 
different methods 

[0387] Detailed scrutiny of the results obtained by the various analyses in the body of this
report, and as summarized in the tables and figures, shows the following findings.

[0388] 1. Careful scrutiny of the performance of individual probes does not make apparent
probe combinations that might perform better than any one probe alone.

[0389] 2. All three classification methodologies evaluated home in on similar sets of
features. The small differences can be attributed to the data structure that may favor one
classifier over another.

[0390] 3. All the classifiers designed with one of these methods were shown to give good
performance when tested on data from an independent pathologist, unseen during the design
process. This gives high confidence in the invention.

[0391] 4. A detection panel based on probe 7 alone gives a high performance.

[0392] 5. If probe 7 is combined with probe 16 or 25 then a better performance is obtained.

[0393] 6. While combinations of other probes with probe 7 appear to improve performance
further, the number of extra cases captured is so low that they may be unrepresentative and the
classifier so designed may not generalize.

[0394] 7. The performance of panels selected from probes excluding probe 7 provided some
discrimination, good enough in comparison with current practice using human screening, but
perhaps not good enough for an automated cytometer in tomorrow's clinical diagnostic cytology
world (see Figure 6).

[0395] 8. Other combinations of probes can provide a useful, but lesser, performance.

[0396] 9. If some probes become unavailable this invention allows the selection of other
combinations of probes. This was illustrated by classifier designs based on a commercially
preferred set of probes only. See Figure 7.

[0397] 10. The invention allows a weighting to be applied against costly probes. Rather than
totally excluding them from the analysis, this allows their inclusion in the panel if their
contribution is important.

[0398] 11. The invention allows the design of single lung cancer type specific discrimination
panels that can discriminate one type of lung cancer from among all other cancers.

[0399] 12. Analysis of the performance of a single panel to classify five cancers showed
discrimination was possible but the overall error rate was worse than a set of five panels each
designed to discriminate one of the cancers from the others.

[0400] 13. A very useful discrimination was obtained with the combination of five two-way
classifiers.

[0401] 14. Common sets of probes were selected by the three classification methodologies
for the five discrimination panels, again giving confidence in this result.

Probes for isolating cases of Adenocarcinoma are 4, 14, 19, 20, 25, and 27.
Probes for isolating cases of Squamous Cell cancer are 1, 2, 3, 24, 25, and 26.
Probes for isolating cases of Large Cell cancer are 1 and 7, or 1 and 21.
Probes for isolating cases of Mesothelioma are 3, 12, and 16.
Probes for isolating cases of Small Cell cancer are 12, 20, and 23.
Probes for recognizing all cancers simultaneously are 1, 2, 3, 4, 12, 14, 19, 22, 23,

An advantage of using the multiple pair-wise panels as defined by this invention is
that doubtful cases may not score on any of the five panels; also, confusing cases may show on
two or more panels. Such anomalous reports would alert the cytologist that further analysis is
indicated.
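
The way five two-way panels could be read together, including the "no panel" and "two or more panels" anomalies just described, can be sketched as follows. The function name, the dictionary layout and the boolean panel calls are assumptions for illustration; how each panel arrives at its own positive or negative call is outside this sketch.

```python
CANCER_TYPES = [
    "adenocarcinoma",
    "squamous cell carcinoma",
    "large cell carcinoma",
    "mesothelioma",
    "small cell carcinoma",
]


def combine_panels(panel_calls):
    """Combine five one-vs-others panel calls into a single report.

    panel_calls maps a cancer type to True if that panel scored positive.
    Returns (cancer_type_or_None, needs_further_analysis).
    """
    positives = [t for t in CANCER_TYPES if panel_calls.get(t, False)]
    if len(positives) == 1:
        # Exactly one panel fired: report that type.
        return positives[0], False
    # No panel fired (doubtful case) or several fired (confusing case):
    # flag the specimen for the cytologist.
    return None, True


print(combine_panels({"mesothelioma": True}))
print(combine_panels({}))
print(combine_panels({"adenocarcinoma": True, "large cell carcinoma": True}))
```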

iv. Risk Management Study
[0409] All the tests applied in this study were statistical in nature. There is a risk that probes
selected on the basis of small improvements in performance will have statistical variations when
tested on new data. To give confidence in the results, the best classifier emerging from the
Linear Discriminant analysis on the Pathologist 1 and Pathologist 2 data was tested. It should be
remembered that the Pathologist 3 data was statistically different from the Pathologist 1 and
Pathologist 2 data, so if good performances are obtained when testing with the Pathologist 3 data,
then this would be encouraging indeed.

(1) Report on Testing with unseen data - Detection panel 
(a) Method 

[0410] In the Section titled "Detection Panel(s) Composition" above, we showed that good
classification is obtained with features 7 and 16. Using SPSS, all the Pathologist 3 data that
reported H scores for both 7 and 16 was selected. Then, using Transform and Compute, the
canonical discrimination function was generated as a new feature. The performance of this
feature alone was then tested.

(b) Results 

[0411] These are the results of testing, on Pathologist 3 data, the classifier designed on
Pathologist 1 and Pathologist 2 data. The classifier was designed using the linear discriminant
function on probes 7 and 16. The Canonical Pathologist 2 function was = 0.965*Probe7 -
0.298*Probe16.
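
Generating the canonical function as a new feature amounts to one weighted sum per case. The sketch below uses the two coefficients quoted above; the function name and the sample H-scores are hypothetical, and no decision threshold is shown because none is stated in the text.

```python
def canonical_score(probe7_hscore, probe16_hscore):
    """Two-probe canonical discriminant score from the quoted coefficients."""
    return 0.965 * probe7_hscore - 0.298 * probe16_hscore


# Hypothetical H-scores (each in the range 0 to 12) for three cases.
cases = [(10, 5), (0, 8), (12, 0)]
for p7, p16 in cases:
    print(p7, p16, round(canonical_score(p7, p16), 3))
```

A higher probe 7 score raises the feature and a higher probe 16 score lowers it, which is all the two signs in the function express.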

Classification Results on Pathologist 3 data using probes 7 and 16

                           Diagnosis    Predicted Group Membership    Total
                           (UCLA)            0            1
Original          Count        0            20            1             21
                               1             6           41             47
                  %            0          95.2          4.8          100.0
                               1          12.8         87.2          100.0
Cross-validated   Count        0            20            1             21
                               1             6           41             47
                  %            0          95.2          4.8          100.0
                               1          12.8         87.2          100.0

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the
functions derived from all cases other than that case.
b 89.7% of original grouped cases correctly classified.
c 89.7% of cross-validated grouped cases correctly classified.
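
Footnote a describes leave-one-out cross validation: each case is classified by a function derived from all the other cases. A minimal sketch of that loop is given below. As a stand-in for the SPSS discriminant analysis it uses a one-dimensional nearest-class-mean rule, and the scores are made-up values, so it illustrates only the hold-one-out procedure, not the reported figures.

```python
def nearest_mean_predict(train, x):
    """Classify x as the label (0 or 1) whose training mean is closer.

    train is a list of (score, label) pairs containing both labels.
    This simple rule is a stand-in for a fitted discriminant function.
    """
    means = {}
    for label in (0, 1):
        values = [s for s, l in train if l == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(x - means[label]))


def leave_one_out_accuracy(cases):
    """Classify each case using a rule derived from all other cases."""
    correct = 0
    for i, (score, label) in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]
        if nearest_mean_predict(rest, score) == label:
            correct += 1
    return correct / len(cases)


# Hypothetical canonical scores with labels (0 = control, 1 = cancer).
separable = [(0.5, 0), (1.0, 0), (1.5, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
overlapping = [(0.0, 0), (1.0, 0), (9.0, 0), (8.0, 1), (10.0, 1), (11.0, 1)]
print(leave_one_out_accuracy(separable))
print(leave_one_out_accuracy(overlapping))
```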

[0412] This is better than classifying the Pathologist 3 data on probe 7 only, shown as follows:

Classification Results on Pathologist 3 data using probe 7 only

                           Diagnosis    Predicted Group Membership    Total
                           (UCLA)            0            1
Original          Count        0            20            1             21
                               1             8           39             47
                  %            0          95.2          4.8          100.0
                               1          17.0         83.0          100.0
Cross-validated   Count        0            20            1             21
                               1             8           39             47
                  %            0          95.2          4.8          100.0
                               1          17.0         83.0          100.0

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified by the
functions derived from all cases other than that case.
b 86.8% of original grouped cases correctly classified.
c 86.8% of cross-validated grouped cases correctly classified.

(c) Conclusion

[0413] This gives confidence that the two-probe classifier based on 7 and 16 is better than
probe 7 alone.

(2) Report on testing with unseen data - Discrimination Panel 

(a) Background 

[0414] Reported below is the performance of the classifier designed with Pathologist 1 and
Pathologist 2 data using LDF and tested with the unseen Pathologist 3 data. The number of
cases at the design stage was relatively small and the numbers in the test data are also small, so a
good degree of variability can be expected between performance on the first and second set.

(b) Method

[0415] In SPSS, the canonical discrimination functions derived in the section titled "Pattern
recognition" were built and tested on Pathologist 3 data for all five classes of cancer.

(c) Results 

[0416] Mesothelioma LDF = probe3sc * .385 - probe12s * .317 + probe16s * 1.006.
Classification Results (only one row of the table is legible)

                           Predicted Group Membership    Total
                                0            1
                  %    1      12.5         87.5          100.0

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified
by the functions derived from all cases other than that case.
b 93.8% of original grouped cases correctly classified.
c 93.8% of cross-validated grouped cases correctly classified.

[0417] Small cell cancer LDF = probe12s * .575 [sign illegible] probe20s * .408 - probe22s * .423 +
probe23s * .344.

Classification Results

                           Small = 1,   Predicted Group Membership    Total
                           others = 0        0            1
Original          Count        0            39            3             42
                               1             1            5              6
                  %            0          92.9          7.1          100.0
                               1          16.7         83.3          100.0
Cross-validated   Count        0            39            3             42
                               1             1            5              6
                  %            0          92.9          7.1          100.0
                               1          16.7         83.3          100.0

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified
by the functions derived from all cases other than that case.
b 91.7% of original grouped cases correctly classified.
c 91.7% of cross-validated grouped cases correctly classified.


[0418] Squamous cell cancer LDF = - probe1sc * .328 - probe2sc * .295 + probe3sc * .741 +
probe24s * .490 + probe25s * .393 + probe26s * .426.

Classification Results (table values not legible)

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified
by the functions derived from all cases other than that case.
b 87.0% of original grouped cases correctly classified.
c 87.0% of cross-validated grouped cases correctly classified.

[0419] Large cell cancer LDF = probe1sc * .847 + probe7sc * .452.

Classification Results (table values not legible)

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified
by the functions derived from all cases other than that case.
b 59.6% of original grouped cases correctly classified.
c 59.6% of cross-validated grouped cases correctly classified.

[0420] The lower, but useful, performance was on a classifier designed and tested with a very
small number of cases of large cell cancer, so this result is still very encouraging.

[0421] Adenocarcinoma LDF = - probe4sc * .515 + probe5sc * .299 - probe14s * .485 -
probe19s * .347 + probe20s * .723 + probe25s * .327 + probe27s * .327.

Classification Results (table values not legible)

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified
by the functions derived from all cases other than that case.
b 89.6% of original grouped cases correctly classified.
c 89.6% of cross-validated grouped cases correctly classified.



(d) Conclusion 

[0422] It is very encouraging to note that the performance of these classifiers stands up to the
test of applying unseen data. This gives a very high confidence in the ability to detect the
individual cancers.

(3) Training and testing on data from different patients and pathologists

[0423] As a "final final" test of robustness, an LDF was trained on the data that was reviewed by
both Pathologist 1 and Pathologist 2. This removes data reviewed by Pathologist 3. Hence testing
on data reviewed by both Pathologist 3 plus Pathologist 1 data is not biased. Previously the test
process was biased through using data from the same patient for test and train.

[0424] LDF produced the same set of features except for probe 4, which was not included. The
LDF was = probe1sc * .288 + probe7sc * .846 - probe15s * .249 - probe16s * .534.
Classification Results

Area under the Curve = .977

                           Diagnosis    Predicted Group Membership    Total
                           (UCLA)            0            1
Original          Count        0            20            0             20
                               1             9           37             46
                  %            0         100.0           .0          100.0
                               1          19.6         80.4          100.0
Cross-validated   Count        0            20            0             20
                               1             9           37             46
                  %            0         100.0           .0          100.0
                               1          19.6         80.4          100.0

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified
by the functions derived from all cases other than that case.
b 86.4% of original grouped cases correctly classified.
c 86.4% of cross-validated grouped cases correctly classified.

[0425] This is still a reasonable result; however, a similar result, with a smaller area under the
curve, was obtained with probe 7 alone on Pathologist 3 only data.

Classification Results (table values not legible)

Area under the curve = .908

a Cross validation is done only for those cases in the analysis. In cross validation, each case is classified
by the functions derived from all cases other than that case.
b 87.9% of original grouped cases correctly classified.
c 87.9% of cross-validated grouped cases correctly classified.

II. Colorectal Cancer
[0426] Epithelial tumors of the intestines are a major cause of morbidity and mortality worldwide.
The colon (including the rectum) is host to more primary neoplasms than any other organ in the
body. Colorectal cancer ranks second only to bronchogenic carcinoma among the cancer killers.
Adenocarcinomas constitute the vast majority of colorectal cancers and represent 70% of all
malignancies arising in the gastrointestinal (GI) tract. The small intestine is an uncommon site
for benign or malignant tumors despite its great length and vast pool of dividing mucosal cells.
(Crawford, J.M., The Gastrointestinal Tract, in Robbins Pathologic Basis of Disease, R.S.E.A.
Cotran, Editor. 1999, W. B. Saunders Company: Philadelphia, p. 775-843).
[0427] The peak incidence for colorectal carcinomas is in the patient age range of 60 to 79
years. Fewer than 20% of cases occur before the age of 50 years. When colorectal carcinoma is
found in a young person, pre-existing ulcerative colitis or one of the polyposis syndromes must
be suspected. Colorectal carcinoma has worldwide distribution. The highest death rates are
found in the United States and Eastern European countries, up to 10-fold greater than the rates in
Mexico, South America and Africa. Environmental factors, particularly dietary practices, are
implicated in these striking geographic contrasts. In addition, many studies implicate obesity and
physical inactivity as risk factors for colon cancer. (Crawford, J.M., The Gastrointestinal Tract,
in Robbins Pathologic Basis of Disease, R.S.E.A. Cotran, Editor. 1999, W. B. Saunders
Company: Philadelphia, p. 775-843).

[0428] Almost all cancers (98%) found in the large intestine are adenocarcinomas. Virtually all
colorectal carcinomas exhibit genetic alterations, including E-cadherin and β-catenin in
adenomatous polyposis coli (APC); human mismatch repair genes, hMSH2, hMLH1 and hPMS2
in hereditary non-polyposis colon carcinoma (HNPCC); and mutation of K-ras and p53 genes.
(Crawford, J.M., The Gastrointestinal Tract, in Robbins Pathologic Basis of Disease, R.S.E.A.
Cotran, Editor. 1999, W. B. Saunders Company: Philadelphia, p. 775-843). Diagnosing
colorectal carcinoma in an early stage is one of the prime challenges to medical professionals
because these carcinomas present with nonspecific clinical symptoms such as fatigue, anemia,
abdominal pain and bloody stools. Currently, the main diagnostic method is colonoscopy to
visually examine whether there is a tumor mass. However, this is an invasive procedure that
involves colon-prepping by the patient and anesthesia throughout the procedure to achieve
unconsciousness. It is not surprising that patient compliance is a major issue. Unfortunately,
available fecal occult blood tests or the guaiac tests are not specific enough. Therefore, current
detection methods for colorectal cancer, such as colonoscopy or sigmoidoscopy, have proven to
be inadequate screening tools due to the invasiveness of the procedures, the relative lack of
accuracy and poor patient compliance. Furthermore, non-invasive fecal occult blood testing
(FOBT) is not effective and suffers from lack of sensitivity or specificity.
[0429] Molecular diagnosis of colorectal carcinoma receives much attention because most of
these cancers have genetic abnormalities. Among technologies, immunohistochemistry (IHC)
and immunocytochemistry (ICC) are widely used to evaluate colorectal carcinogenesis. A
variety of colorectal tumor markers have been discovered to aid physicians in making timely,
precise diagnoses, and to provide significantly better patient management. Unfortunately, none
of these tumor markers is a "magic bullet" with both a high sensitivity and specificity.
Therefore, alternative ways to enhance diagnostic accuracy are necessary.
[0430] An alternative way to enhance diagnostic accuracy is to develop a panel comprising a
plurality of probes, each of which specifically binds a marker associated with colorectal cancer.
All candidate probes are to be tested with ICC techniques. In some embodiments, specimens
may be obtained from colonic washings. Although cytological specimens obtained from colonic
washings often have fewer cells than tissue sections, high quality polyclonal or
monoclonal antibodies may be employed to ensure good assay performance. In some
embodiments, test slides may be made by spiking tumor cells into a cell suspension before actual
patient specimens are tested. In other embodiments, limitations due to colonic washings often
having fewer cells than tissue sections may be overcome by studying patients who match the
variables as closely as possible, such as age, gender, diagnosis, tumor grade, tumor size, clinical
stage, etc.

[0431] Once the specimens are collected, the specimens will be processed and analyzed. 
Statistical analysis will be used to design panels, as described above for lung cancer. During
processing, technical issues such as cell smears or pellets not sticking to slides during harsh
washings may occur in some embodiments. However, such issues can readily be addressed by
manipulation of software or modifying staining protocols to mitigate such problems. In some 
embodiments, the specimens will be processed and analyzed using a device that automatically 
samples the specimen, and prepares slides for diagnosis. It is anticipated that a broad menu of 
probes will be used initially. The number of probes will be pruned to a suitably sized panel in 
order to retain a high level of sensitivity and specificity. Selection of the final probes will be 
based on a pre-defined threshold of the percentage of positively stained tumor cells. Sophisticated
statistical analysis will be employed to make these determinations. Since the panel-assay 
approach to detecting malignancies is applicable to solid tumors, and several of the same tumor 
markers are in different panels, this method may be carried out in parallel, as well as serially. In 
this manner, the assay development process can be expedited. 
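
The pruning step described above can be illustrated with a minimal sketch. It assumes a hypothetical scoring table in which each specimen is recorded as positive (1) or negative (0) for each probe, and a hypothetical panel rule that calls a specimen positive when any probe in the panel is positive. The probe names, the data, the rule, and the greedy removal strategy are all assumptions made for illustration and are not taken from the present disclosure. Sensitivity and specificity follow the definitions given in the Background.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical staining calls: probe -> one 0/1 result per specimen.
# The first four specimens are tumors, the last four are normal controls.
STAINING: Dict[str, List[int]] = {
    "probe_A": [1, 1, 0, 0, 0, 0, 0, 0],
    "probe_B": [0, 0, 1, 0, 0, 0, 0, 0],
    "probe_C": [1, 0, 0, 0, 0, 1, 0, 0],
    "probe_D": [0, 0, 0, 1, 0, 0, 0, 0],
}
IS_TUMOR: List[bool] = [True, True, True, True, False, False, False, False]


def panel_performance(
    panel: Sequence[str],
    staining: Dict[str, List[int]],
    is_tumor: Sequence[bool],
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when any positive probe flags a specimen."""
    tp = fn = tn = fp = 0
    for i, tumor in enumerate(is_tumor):
        called_positive = any(staining[p][i] for p in panel)
        if tumor and called_positive:
            tp += 1
        elif tumor:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # True Positives / (True Positives + False Negatives)
    specificity = tn / (tn + fp)  # True Negatives / (False Positives + True Negatives)
    return sensitivity, specificity


def prune_panel(
    panel: Sequence[str],
    staining: Dict[str, List[int]],
    is_tumor: Sequence[bool],
) -> Tuple[List[str], Tuple[float, float]]:
    """Greedily drop probes whose removal lowers neither sensitivity nor specificity."""
    current = list(panel)
    best = panel_performance(current, staining, is_tumor)
    changed = True
    while changed and len(current) > 1:
        changed = False
        for probe in list(current):
            trial = [p for p in current if p != probe]
            sens, spec = panel_performance(trial, staining, is_tumor)
            if sens >= best[0] and spec >= best[1]:
                current, best, changed = trial, (sens, spec), True
                break
    return current, best


if __name__ == "__main__":
    print(panel_performance(list(STAINING), STAINING, IS_TUMOR))
    print(prune_panel(list(STAINING), STAINING, IS_TUMOR))
```

In this toy data set, one probe contributes only a false positive, so removing it raises specificity without costing sensitivity. A real selection would rest on the statistical analysis referred to in the text rather than on this greedy rule.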

[0432] Compared with lung cancers, which have five subtypes (adenocarcinoma, squamous cell
carcinoma, small cell carcinoma, large cell carcinoma, and mesothelioma), colorectal epithelial 
tumors are predominantly adenocarcinoma. This allows the colorectal tumor panel to be 
specifically targeted at only one type of cancer. A large number of cytological specimens is not
necessary because the panel can be tested on either biopsied or colectomy tissue.

Library of Probes/Markers 
[0433] Various sources containing information about cancer markers were reviewed. An
arbitrary criterion of 20% or greater positivity of colorectal carcinoma was used to select probes
for a preferred panel for detection and/or diagnosis of colorectal cancer. The term "20% or
greater positivity" means that if 100 tumor cases were studied, 20 or more of these cases would
have shown a presence of the individual marker, while the remaining cases (80 or fewer) would not have
shown a presence of the individual marker. A preferred panel may include molecular markers
selected from AKT, β-catenin, Brain-type Glycogen Phosphorylase (BPG), Caveolin-1,
CD44v6, cFLIP, Cripto-I, Amphiregulin, Cyclin D1, Cyclooxygenase (COX-2), Cytokeratin 20
(CK20), Carcinoembryonic Antigen (CEA), E-cadherin, Bcl-2, Bax, hMLH1, hMSH2,
Epidermal Growth Factor Receptor (EGFR), Ephrin-B2 (Eph-B2), Ephrin-B4 (Eph-B4), FasL,
HMGI(Y), Ki-67, Lysozyme, Matrilysin (MMP-7), p16, p68, Retinoblastoma (Rb), cdk2/cdc2,
S100A4, YB-1 and p53. A brief description of the library of probes/markers utilized in the
present example is provided below.
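
As a rough illustration of the 20% criterion, the sketch below filters a small table of positivity figures. The percentages are the single-study values cited in the marker descriptions of this example. The dictionary layout, the function name `select_probes`, and the sub-threshold "hypothetical marker X" entry are assumptions introduced only to show how the cutoff would act.

```python
# Percentage of colorectal carcinoma cases reported positive for each marker,
# as cited in the marker descriptions of this example (one study per marker).
REPORTED_POSITIVITY = {
    "AKT": 57,
    "BPG": 83,
    "Caveolin-1": 88,
    "CD44v6": 59,
    "Cyclin D1": 30,
    "COX-2": 38,
    "EGFR": 75,
    "Ki-67": 48,
    "Matrilysin (MMP-7)": 34,
    "p68": 76,
    "hypothetical marker X": 12,  # invented value, included only to show exclusion
}

THRESHOLD_PERCENT = 20  # the "20% or greater positivity" criterion


def select_probes(positivity, threshold=THRESHOLD_PERCENT):
    """Return (marker, percent) pairs at or above the threshold, highest first."""
    kept = [(name, pct) for name, pct in positivity.items() if pct >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


for name, pct in select_probes(REPORTED_POSITIVITY):
    print(f"{name}: {pct}%")
```

Sorting by reported positivity is a presentation choice only; the text does not rank the markers.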
AKT 

[0434] This is a proto-oncogene with profound anti-apoptotic activity. This serine-threonine
kinase is over-expressed in numerous malignancies. Roy et al showed normal colonic mucosa
and hyperplastic polyps exhibited no significant AKT expression, in marked contrast to the
dramatic AKT immunoreactivity seen in colorectal cancers (57% positive). In addition, AKT
was also detected in 57% of adenomas, indicating that over-expression of this proto-oncogene is an
early event during colon carcinogenesis. (Roy, H.K., et al., AKT proto-oncogene overexpression
is an early event during sporadic colon carcinogenesis. Carcinogenesis, 2002. 23(1): p. 201-5).
β-catenin

[0435] Mutations of the tumor-suppressor gene adenomatous polyposis coli (APC) have been detected in 80%
of sporadic colorectal carcinomas. They occur in small-sized adenomas and even in the smallest
lesions with the risk of neoplasia, the dysplastic aberrant crypt foci. Products of the APC gene
regulate intracellular β-catenin. Studies have shown that mutated APC causes a decreased turnover
rate and leads to an accumulation of β-catenin. Herter et al discovered not only a linear increase in the
expression level of β-catenin but also a change in the location of β-catenin from adenoma to
carcinoma, in a majority of cases. In adenomas with mild and moderate dysplasia, β-catenin is
only present in the nucleus. In severe dysplastic adenomas and carcinomas, it is present in both
cytoplasm and the nucleus. In normal colonic mucosa, only weak staining is seen, at the cell-to-cell
border membranes and in the cytoplasm. These results may be important for diagnostic and
clinical purposes, because the nuclear presence of β-catenin may be the earliest molecular
evidence of colorectal malignancy. (Herter, P., et al., Intracellular distribution of beta-catenin in
colorectal adenomas, carcinomas and Peutz-Jeghers polyps. J Cancer Res Clin Oncol, 1999.
125(5): p. 297-304).
Brain-type Glycogen Phosphorylase (BPG)
[0436] Brain-type glycogen phosphorylase (BPG) is a unique intestinal malignancy marker. Glycogen
phosphorylase has three major isoforms: muscle, liver and brain. Previously, gastric carcinoma has been shown
to be associated with abnormal expression of BPG through immunohistochemistry (IHC)
techniques. A new study by Tashima et al demonstrated for the first time that BPG was present
in 83% of colorectal carcinomas by IHC. More interestingly, normal colonic mucosa from
remote sites is negative for BPG whereas overtly normal-looking mucosa adjacent to a tumor
is positive. (Tashima, S., et al., Expression of brain-type glycogen phosphorylase is a
potentially novel early biomarker in the carcinogenesis of human colorectal carcinomas. Am J
Gastroenterol, 2000. 95(1): p. 255-63).

Caveolin-1 

[0437] This is a major structural protein of caveolae, the vesicular invaginations of the plasma
membrane. Studies have demonstrated that caveolin family members contain a common domain,
termed caveolin-scaffolding, that functions to organize signaling molecules, including G-protein,
Ha-ras, Src-family tyrosine kinases, and epidermal growth factor receptor (EGFR). While in vitro
and in vivo animal experiments demonstrated a suppressive effect of caveolin-1 in cell
transformation and breast carcinogenesis, other studies, including studies of human breast and
prostate cancers, revealed a positive association of caveolin-1 expression with tumorigenesis and
progression, suggesting a tumor-promoting function. Fine et al's study showed 88% of colonic
adenocarcinomas are positive for caveolin-1 by IHC. (Fine, S.W., et al., Elevated expression of
caveolin-1 in adenocarcinoma of the colon. Am J Clin Pathol, 2001. 115(5): p. 719-24).
CD44v6 

[0438] This is a widely expressed cell-surface glycoprotein that may be involved in cell-to-cell
and cell-to-matrix interactions. An abundance of CD44s is present on cells of normal epithelial
and hematopoietic origin. In contrast, the alternatively spliced CD44 variants (CD44v) are
expressed predominantly on cells and tumors of epithelial origin. Several reports have associated
expression of CD44v6 with an advanced stage of colorectal carcinoma. Ishida studied 63
colorectal carcinoma patients through IHC techniques and found 59% of the cases to be positive
for CD44v6. Normal colonic mucosa is negative for CD44v6. (Ishida, T.,
Immunohistochemical expression of the CD44 variant 6 in colorectal adenocarcinoma. Surg
Today, 2000. 30(1): p. 28-32).
Cellular FLICE-like Inhibitory Protein (cFLIP)
[0439] Cellular FLICE-like inhibitory protein (cFLIP) is an endogenous inhibitory regulator of
Fas-mediated apoptosis. Although the physiological functions of cFLIP have not yet been
clarified, pathogenetic implications in disease, including tumors, have been suspected.
Moreover, it has recently been reported that overexpression of cFLIP results in the escape of
tumor cells from T-cell immunity and is possibly related to tumor establishment and growth.
Ryu et al have demonstrated that 100% (52/52) of colonic adenocarcinomas are positive for
cFLIP. Normal colonic epithelium and adenomatous polyps have much lower staining intensity.
(Ryu, B.K., et al., Increased expression of cFLIP(L) in colonic adenocarcinoma. J Pathol, 2001.
194(1): p. 15-9).

Cripto-I and Amphiregulin 
[0440] Cripto-I (CR-I) and amphiregulin (AR) are epidermal growth factor (EGF)-related 
peptides. Several reports have demonstrated that AR and CR-I function as autocrine growth 
factors in human colon epithelial cells in vitro. Furthermore, it has been demonstrated that AR 
and CR-I are expressed in a majority of human primary colon carcinomas. In particular, 
overexpression of either AR or CR-I proteins has been found by IHC in approximately 70% of
human colon adenomas and carcinomas. (De Angelis, E., et al., Expression of cripto and
amphiregulin in colon mucosa from high risk colon cancer families. Int J Oncol, 1999. 14(3): p.
437-40).

Cyclin D1

[0441] This protein plays an important role in cell proliferation. Mutations and/or altered
expression of Cyclin D1 are involved in neoplasia. Increased expression of Cyclin D1 is
observed in esophageal, head, neck, hepatic, breast and colorectal cancers. A study by Arber et
al revealed increased Cyclin D1 staining in 30% of colorectal adenocarcinomas and 34% of
adenomatous polyps but not in hyperplastic polyps or normal mucosa. (Arber, N., et al.,
Increased expression of cyclin D1 is an early event in multistage colorectal carcinogenesis.
Gastroenterology, 1996. 110(3): p. 669-74).

Cyclooxygenase (COX-2) 
[0442] This is a prostaglandin synthase enzyme involved in arachidonic acid metabolism. 
Since evidence showed tumor-suppressive effects of nonsteroidal anti-inflammatory drugs
(NSAIDs) on colorectal cancer, cyclooxygenase has received attention because it is plausible
that the tumor-suppressive effects of NSAIDs are due to their reduction of COX-2 activity.
Sakuma et al demonstrated that 38% of colorectal cancers are COX-2 positive. COX-2 staining
was more intense in areas of cancer tissue than in non-cancerous tissues. In some specimens,
tissue close to the cancer was stained more intensively than tissue further away from the cancer.
(Sakuma, K., et al., Cyclooxygenase (COX)-2 immunoreactivity and relationship to p53 and Ki-67
expression in colorectal cancer. J Gastroenterol, 1999. 34(2): p. 189-94).

Cytokeratin 20 (CK 20) and Carcinoembryonic antigen (CEA) 
[0443] These are well known colorectal carcinoma markers. They have been reported in many
scientific articles. Even though they are not specific for colorectal tumors, one study by
Lagendijk et al revealed that, immunohistochemically, CK 20 and CEA are two key discriminative
markers to differentiate metastatic colonic adenocarcinoma to the ovary from breast and primary
ovarian carcinoma. (Lagendijk, J.H., et al., Immunohistochemical differentiation between
primary adenocarcinomas of the ovary and ovarian metastases of colonic and breast origin.
Comparison between a statistical and an intuitive approach. J Clin Pathol, 1999. 52(4): p.
283-90).

E-cadherin, p53, Bcl-2, Bax, hMLH1, hMSH2
[0444] Kapiteijn et al and Bukholm et al, in separate studies, discovered a number of oncogenes
and tumor suppressor genes involved in the oncogenesis of colorectal cancers. E-cadherin,
p53, Bcl-2, and Bax all showed greater than 20% immunostaining in tumor cells. The mismatch repair
genes hMLH1 and hMSH2 are also significantly increased. These repair genes are involved in
genetic "proof-reading" during DNA replication, and hence are referred to as caretaker genes.
Mutation of these genes has been shown to be involved in the early development of
gastrointestinal malignancy. (Kapiteijn, E., et al., Mechanisms of oncogenesis in colon versus
rectal cancer. J Pathol, 2001. 195(2): p. 171-8; Bukholm, I.K. and J.M. Nesland, Protein
expression of p53, p21 (WAF1/CIP1), bcl-2, Bax, cyclin D1 and pRb in human colon
carcinomas. Virchows Arch, 2000. 436(3): p. 224-8). Yantiss et al showed p53 and E-cadherin
are two markers differentiating adenomas with misplaced epithelium from adenomas with
invasive adenocarcinoma, indicating the specificity of these two markers for colorectal cells.
(Yantiss, R.K., et al., Utility of MMP-1, p53, E-cadherin, and collagen IV immunohistochemical
stains in the differential diagnosis of adenomas with misplaced epithelium versus adenomas with
invasive adenocarcinoma. Am J Surg Pathol, 2002. 26(2): p. 206-15). These markers are
common to almost all solid tumors.

Epidermal Growth Factor Receptor (EGFR)
[0445] Epidermal growth factor receptor (EGFR) is a 170-kilodalton transmembrane cell-surface
receptor. It, along with c-erb B-2, c-erb B3, and c-erb B4, has tyrosine kinase activity
and is encoded by the c-erb-B protooncogene. Chimeric anti-EGFR monoclonal antibody is an
investigational therapy for advanced stages of colon adenocarcinoma. Increased levels of EGFR
are found in many solid tumors, including colorectal carcinoma, squamous cell carcinoma of the
lung, head, neck, cervix, breast, prostate and bladder. One study by Goldstein et al showed 75%
of the colonic adenocarcinomas had EGFR immunohistochemical positivity (Goldstein, N.S. and
M. Armin, Epidermal growth factor receptor immunohistochemical reactivity in patients with
American Joint Committee on Cancer Stage IV colon adenocarcinoma: implications for a
standardized scoring system. Cancer, 2001. 92(5): p. 1331-46).

Ephrin-B2 (Eph-B2) and Ephrin-B4 (Eph-B4) 
[0446] The erythropoietin-producing amplified sequence (Eph) family is the largest sub-family 
of receptor tyrosine kinases (RTKs). Eph-B2 and B4 are the ligands binding to Eph. The ephrin-Eph
system is important in embryological development and differentiation of the nervous and
vascular systems. Several studies have shown that high expression of ephrins may be associated
with increased potential for tumor growth, tumorigenicity, and metastasis. Using
immunohistochemical analysis, Liu et al showed Eph-B2 and Eph-B4 had greater staining
intensity in 100% (5/5) of the cases studied compared with adjacent normal mucosa. (Liu, W., et
al., Coexpression of ephrin-Bs and their receptors in colon carcinoma. Cancer, 2002. 94(4): p.
934-9).

FasL 

[0447] This is a transmembrane protein member of the tumor necrosis factor superfamily, and
induces cell death in apoptosis-sensitive cells expressing its receptor, Fas (CD95/APO-1). It has
been widely demonstrated that FasL is up-regulated in several types of cancer. Moreover, in 
vitro and in vivo studies have shown that FasL can enable cancer cells to mount a Fas 
counterattack, impairing the immune response by inducing apoptosis in anti-tumor immune 
effector cells. These findings suggest that FasL expression by cancer cells may be an important 
factor in the inhibition of anti-tumor immune responses. Belluco et al showed that FasL 
expression is a relatively early event in carcinomas and in colorectal tumorigenesis. FasL
expression was found in 28% of hyperplastic polyps, 76% of low-grade and 93% of high-grade
polyps (Belluco, C., et al., Fas ligand is up-regulated during the colorectal adenoma-carcinoma
sequence. Eur J Surg Oncol, 2002. 28(2): p. 120-5). The results are in line with other findings
that FasL expression was detected in 81% of carcinomas and in 41% of adenomas. Moreover,
FasL was significantly more frequently expressed in high-grade dysplastic adenomas than in
low-grade adenomas.

HMGI(Y) 

[0448] Proteins HMG-I, HMG-Y and HMGI-C constitute the high mobility group I protein
family. The first two proteins are encoded by the same gene, HMGI(Y), through alternative
splicing, while HMGI-C is the product of a different gene. HMGI genes are involved in the
generation of benign and malignant tumors. Previous reports showed HMGI(Y) proteins are
abundantly expressed in colon carcinoma cell lines and tissues but not in normal colon mucosa.
Chiappetta et al discovered 36 colorectal carcinomas were all positive for HMGI(Y) by IHC,
whereas no expression was detected in normal colon mucosa. HMGI(Y) expression in adenomas
was closely correlated with the degree of cellular atypia. Only 2 of the 18 non-neoplastic polyps
tested were HMGI(Y)-positive. (Chiappetta, G., et al., High mobility group HMGI(Y) protein
expression in human colorectal hyperplastic and neoplastic diseases. Int J Cancer, 2001. 91(2):
p. 147-51). These results indicate that HMGI(Y) protein induction is associated with the early
stages of neoplastic transformation of colon cells and only rarely with colon cell
hyperproliferation.

Ki-67

[0449] Ki-67 is a cell proliferation nuclear marker. It is expressed in a variety of tumors,
including colorectal cancer. One study by Sakuma et al demonstrated 48% of colorectal cancers
are positive for Ki-67. (Sakuma, K., et al., Cyclooxygenase (COX)-2 immunoreactivity and
relationship to p53 and Ki-67 expression in colorectal cancer. J Gastroenterol, 1999. 34(2): p.
189-94).

Lysozyme 

[0450] Lysozyme is an enzyme with a broad spectrum of antibacterial activities. It is present in 
numerous human tissue fluids and secretions, including saliva, tears, milk, serum, and gastric and
small intestinal juice. Lysozyme is absent in normal colonic epithelium. Interestingly, numerous
immunohistochemical studies demonstrated the expression of lysozyme in the tumor cells of
gastric adenomas and adenocarcinomas. Many studies have shown lysozyme positivity in colon
cancer ranging from 28% to 80% while normal colonic glands adjacent to the adenocarcinomas
did not show any lysozyme protein expression. (Yuen, S.T., et al., Up-regulation of lysozyme
production in colonic adenomas and adenocarcinomas. Histopathology, 1998. 32(2): p. 126-32).
Matrilysin (MMP-7)

[0451] Matrilysin is a member of the MMP gene family and has proteolytic activity against a 
spectrum of substrates, such as collagens, proteoglycans, elastin, laminin, fibronectin and casein. 
It is produced by malignant tumor cells such as esophageal, colorectal, gastric, head, neck, lung, 
prostate and hepatocellular carcinomas. Immunohistochemical studies have shown that the
expression of matrilysin correlates significantly with nodal or distant metastasis in gastric and
colorectal carcinomas. A study by Masaki et al showed 34% of colorectal carcinomas are
positive for matrilysin. (Masaki, T., et al., Matrilysin (MMP-7) as a significant determinant of
malignant potential of early invasive colorectal carcinomas. Br J Cancer, 2001. 84(10): p.
1317-21).

p16

[0452] This is a cell cycle inhibitor and a major tumor-suppressor protein. A role for p16 in
intestinal neoplasia is suggested by the observation that the promoter region is methylated in a
subset of human colon tumors. Dai et al showed p16 expression was very low in normal mucosa,
and in 18 of 28 primary colon carcinomas and 5 of 5 metastatic colon carcinomas. In addition,
p16 staining correlated inversely with that of Ki-67, cyclin A and the retinoblastoma protein,
suggesting cell cycle progression was inhibited. (Dai, C.Y., et al., p16(INK4a) expression begins
early in human colon neoplasia and correlates inversely with markers of cell proliferation.
Gastroenterology, 2000. 119(4): p. 929-42).
p68

[0453] This is an interferon-inducible protein kinase, which is a key factor in the regulation of
both viral and cellular protein synthesis. Its expression is correlated with cellular differentiation
in both normal and neoplastic cell types. Singh et al have found that p68 is positive in 76% of
colorectal carcinoma patients. Normal colonic mucosa showed weak p68 staining. High p68
expression demonstrated a trend toward improved survival. Patients with tumors expressing high
levels (3 to 4+) of p68 had a higher 5-year survival rate compared to patients with lower p68
expression. (Singh, C., et al., Expression of p68 in human colon cancer. Tumour Biol, 1995.
16(5): p. 281-9).

Rb and cdk2/cdc2
[0454] The retinoblastoma (Rb) gene is a tumor-suppressor gene and its product, pRB, is
known to act as a negative regulator of the cell cycle. Lack of pRB expression resulting
from gene alterations is considered to be responsible for the genesis of several malignancies,
including osteosarcomas and carcinomas of the lung, breast and bladder. In contrast, colorectal
cancer has reportedly shown infrequent inactivation of this gene, and Southern blot analysis has
demonstrated Rb gene amplification in approximately 30% of colorectal cancers. Yamamoto et
al studied Rb and its related kinases, cdk2 and cdc2, by Western blot and found increased levels
of cdk2/cdc2, as well as a hyperphosphorylated form of pRB, in colorectal carcinoma. Furthermore,
immunohistochemical studies showed that cdk2/cdc2 was expressed exclusively in cancer cells
positive for pRB. These results suggest that an increase in the expression of cdk2/cdc2 in
colorectal cancer may have prevented pRB from braking the cell cycle through phosphorylation.
(Yamamoto, H., et al., Coexpression of cdk2/cdc2 and retinoblastoma gene products in
colorectal cancer. Br J Cancer, 1995. 71(6): p. 1231-6).

S100A4 

[0455] This is a calcium-binding protein and has been implicated in cell
immortalization, cell growth, differentiation of mammary epithelial stem cells to myoepithelial-like
cells, and fibrogenesis. In addition, S100A4 has been reported to be specifically expressed
in metastatic tumor cells. Takenaga et al observed 44% of focal carcinomas and 94% of
adenocarcinomas were immunopositive while none of the adenomas were positive. Interestingly,
the incidence of immunopositive cells increased according to the depth of invasion, and nearly all
of the carcinoma cells in 14 metastases of the liver were positive. These results suggest that
S100A4 is a good marker to differentiate adenoma from adenocarcinoma and may be involved in
the progression and metastatic process of colorectal neoplastic cells. (Takenaga, K., et al.,
Increased expression of S100A4, a metastasis-associated gene, in human colorectal
adenocarcinomas. Clin Cancer Res, 1997. 3(12 Pt 1): p. 2309-16).

Y-box Binding Protein (YB-1) 
[0456] The Y-box binding protein (YB-1) is a member of a family of DNA binding proteins
that contain a highly conserved cold shock domain and interact with inverted CCAAT boxes (Y
boxes). YB-1 is expressed in a wide range of cell types and has been implicated in the regulation
of various genes involved in cell proliferation. It is also overexpressed in cisplatin-resistant
cancer cell lines, suggesting that YB-1 may be involved in either DNA repair or DNA damage
response, in addition to its role as a transcription factor. Shibao et al showed YB-1 was
overexpressed in almost all cases of colorectal carcinomas compared with normal mucosa.
(Shibao, K., et al., Enhanced coexpression of YB-1 and DNA topoisomerase II alpha genes in
human colorectal carcinomas. Int J Cancer, 1999. 83(6): p. 732-7).

Utility of Colorectal Carcinoma Panel 
[0457] The colorectal carcinoma panel contains not only markers for early detection of 
colorectal carcinoma but also markers for assessing metastatic potential and prognosis. For 
example, positive CEA (carcinoembryonic antigen) reaction has a significant relationship with 
the grade of differentiation of colorectal carcinoma while diffuse cellular expression of this
antigen often indicates neoplasms extending beyond the intestinal wall and invading the lymph
vessels. The number of tissue antigens expressed is significantly related to the extent of tumor
spread through the intestinal wall (Lorenzi, M., et al., Histopathological and prognostic
evaluation of immunohistochemical findings in colorectal cancer. Int J Biol Markers, 1997.
12(2): p. 68-74). Also, serum CEA levels and the expression of p53 proteins provide
complementary prognostic information for colorectal cancer. Positive immunostaining of p53 
and elevated CEA levels are associated with low cumulative disease free survival and have been 
shown to have independent prognostic significance (Diez, M., et al. Time-dependency of the 
prognostic effect of carcinoembryonic antigen and p53 protein in colorectal adenocarcinoma. 
Cancer, 2000. 88(1): p. 35-41). Nasierowska-Guttmejer and associates (Nasierowska-Guttmejer, 
A., The comparison of immunohistochemical proliferation and apoptosis markers in rectal 
carcinoma treated surgically or by preoperative radio-chemotherapy, Pol J Pathol, 2001. 52(1- 
2): p. 53-61) have shown that low expression of Ki-67 and high levels of Bax expression are 
correlated with the total, or near-total, response of colorectal cancer to the treatment and
regression of the tumor mass. However, less than two-thirds of the cases are correlated with low
expression of p53, MIB1, bax and bcl-2. Another study showed that higher p53 and Ki67 values
were associated with prognostically poor histopathologic features (Saleh, H.A., H. Jackson, and
M. Banerjee, Immunohistochemical expression of bcl-2 and p53 oncoproteins: correlation with
Ki67 proliferation index and prognostic histopathologic parameters in colorectal neoplasia.
Appl Immunohistochem Mol Morphol, 2000. 8(3): p. 175-82). Additionally, expression of
cyclin D1, CD44v6, and matrilysin (MMP-7) in colorectal cancer has been shown to be
correlated with high recurrence rates, reduced relapse-free survival and poor overall survival (McKay, J.A.,
et al., Analysis of key cell-cycle checkpoint proteins in colorectal tumours. J Pathol, 2002.
196(4): p. 386-93; Ropponen, K.M., et al., Expression of CD44 and variant proteins in human
colorectal cancer and its relevance for prognosis. Scand J Gastroenterol, 1998. 33(3): p. 301-9;
Bhatavdekar, J.M., et al., Molecular markers are predictors of recurrence and survival in
patients with Dukes B and Dukes C colorectal adenocarcinoma. Dis Colon Rectum, 2001. 44(4):
p. 523-3; and Adachi, Y., et al., Clinicopathologic and prognostic significance of matrilysin
expression at the invasive front in human colorectal cancers. Int J Cancer, 2001. 95(5): p. 290-4).
Studies also demonstrated that increased expression of p53, MMP-7, β-catenin and reduced
expression of E-cadherin in colorectal carcinomas were associated with an increase in their
metastatic potential (Zeng, Z.S., et al., Matrix metalloproteinase-7 expression in colorectal
cancer liver metastases: evidence for involvement of MMP-7 activation in human cancer
metastases. Clin Cancer Res, 2002. 8(1): p. 144-8; Ikeguchi, M., et al., Reduced E-cadherin
expression and enlargement of cancer nuclei strongly correlate with hematogenic metastasis in
colorectal adenocarcinoma. Scand J Gastroenterol, 2000. 35(8): p. 839-46; Sory, A., et al., Does
p53 overexpression cause metastases in early invasive colorectal adenocarcinoma? Eur J Surg,
1997. 163(9): p. 685-92; and Hiscox, S. and W.G. Jiang, Expression of E-cadherin, alpha, beta
and gamma-catenin in human colorectal cancer. Anticancer Res, 1997. 17(2B): p. 1349-54).
Thus, colorectal panels will have diagnostic and prognostic value and provide patient risk 
stratification to guide in clinical therapy. 

Preferred Probes/Markers 
[0458] A preferred panel for detecting and/or diagnosing colorectal carcinoma comprises one or
more tumor markers listed above. A more preferred panel for detecting and/or diagnosing
colorectal carcinoma comprises one or more tumor markers selected from β-catenin, E-cadherin,
hMSH2, hMLH1, p53, and cytokeratin 20. Virtually all colorectal carcinomas exhibit
genetic alterations, thus providing an opportunity for early detection. E-cadherin and β-catenin
interact intimately with the APC (adenomatous polyposis coli) tumor-suppressor gene. A
defect in the APC gene is associated with FAP (familial adenomatous polyposis) and Gardner
syndrome, which have a very high incidence of colorectal cancer. Another genetic alteration in
colorectal carcinoma involves the DNA repair genes. Two important mismatch repair genes,
hMSH2 and hMLH1, are responsible for "proof-reading" during DNA replication. Another
marker, p53, is located on chromosome 17, and more than 70% of colorectal cancers have losses at
chromosome 17p. Cytokeratin 20 (CK20) is consistently expressed in colorectal carcinoma and,
when coupled with negative cytokeratin 7 (CK7) staining, the combination (CK20+/CK7-) is
highly specific for colorectal carcinoma (Chu, P., E. Wu, and L.M. Weiss, Cytokeratin 7 and
cytokeratin 20 expression in epithelial neoplasms: a survey of 435 cases. Mod Pathol, 2000.
13(9): p. 962-72). The importance of these tumor markers with respect to colorectal carcinoma is
detailed below.
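
The CK20/CK7 staining combination mentioned above can be written as a simple rule. The sketch below is illustrative only: the function names, the boolean encoding of staining results, and the label format are assumptions, and an actual interpretation would rest on the full panel and on pathologist review rather than on this two-marker check.

```python
def ck_profile(ck20_positive: bool, ck7_positive: bool) -> str:
    """Label a specimen by its cytokeratin 20 / cytokeratin 7 staining pattern."""
    ck20 = "CK20+" if ck20_positive else "CK20-"
    ck7 = "CK7+" if ck7_positive else "CK7-"
    return f"{ck20}/{ck7}"


def matches_colorectal_pattern(ck20_positive: bool, ck7_positive: bool) -> bool:
    """True only for the CK20+/CK7- combination described in the text."""
    return ck20_positive and not ck7_positive


# Enumerate all four staining combinations (hypothetical specimens).
for ck20 in (True, False):
    for ck7 in (True, False):
        print(ck_profile(ck20, ck7), matches_colorectal_pattern(ck20, ck7))
```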

β-catenin

[0459] The inherited defect underlying familial adenomatous polyposis (FAP) and Gardner
syndromes has been mapped to 5q21, site of the APC tumor-suppressor gene. APC protein binds
to cytoskeletal protein β-catenin in a cellular adhesion molecular complex, which includes
intercellular adhesion molecule E-cadherin. β-catenin can also act as an oncogene. When it is
not bound to E-cadherin (and thus not participating in cell-cell adhesion), β-catenin binds to a family of
protein partners known as T cell factor-lymphoid enhancer factor (Tcf-Lef) proteins, which
activate other genes. Genes activated by this β-catenin:Tcf complex are thought to include those
stimulating cell proliferation and inhibiting apoptosis. APC binding to β-catenin directs it toward
degradation, thereby inhibiting the β-catenin:Tcf signaling pathway. Mutations in the APC gene
reduce the affinity of APC protein for β-catenin, leading to loss of intercellular contact on the one
hand and an increased cytoplasmic pool of β-catenin on the other. The resultant enhancement of
Tcf-mediated cell proliferation initiates a sequence of events that predisposes to the development
of carcinoma (Peifer, M., Beta-catenin as oncogene: the smoking gun. Science, 1997. 275(5307):
p. 1752-3). Hence, APC is regarded as a "gatekeeper" gene. Mutations in APC underlie FAP,
and are early events in the evolution of sporadic colon cancer, with mutations being found in
85% of colorectal carcinomas (Crawford, J.M., The Gastrointestinal Tract, in Robbins
Pathologic Basis of Disease, R.S.e.a. Cotran, Editor. 1999, W. B. Saunders Company:
Philadelphia. p. 775-843). Notably, most of the tumors without mutations in APC show
mutations in β-catenin.
hMSH2 and hMLH1

[0460] Inherited mutations in any of the genes that are involved in DNA repair are putatively
responsible for the familial syndrome of hereditary nonpolyposis colorectal cancer (HNPCC).
These mismatch repair genes, hMSH2 and
hMLH1, are involved in genetic "proof-reading" during DNA replication, and hence are referred
to as "caretaker" genes. There are 50,000 to 100,000 dinucleotide repeat sequences in the human
genome, and mutations in mismatch repair genes can be detected by the presence of widespread
alterations in these repeats. This is referred to as microsatellite instability. Patients who inherit a
mutant DNA repair gene have normal repair activity because of the remaining normal allele.
However, cells in some organs (colon, stomach, endometrium) are susceptible to a second,
somatic mutation that inactivates the wild-type allele. Mutation rates up to 1000 times normal
ensue, such that most of the HNPCC tumors show microsatellite instability. Mutation of these
genes has been shown to be involved in the early development of gastrointestinal cancer
(Kapiteijn, E., et al., Mechanisms of oncogenesis in colon versus rectal cancer. J Pathol, 2001.
195(2): p. 171-8; Bukholm, I.K. and J.M. Nesland, Protein expression of p53, p21 (WAF1/CIP1),
bcl-2, Bax, cyclin D1 and pRb in human colon carcinomas. Virchows Arch, 2000. 436(3): p.
224-8).

p53

[0461] Losses at chromosome 17p have been found in 70 to 80% of colon cancers. These
chromosomal deletions affect the p53 gene, suggesting that mutations in p53 occur late in colon
carcinogenesis. As is well known, p53 plays a critical role in cell cycle regulation. A multi-hit
concept for colon cancer carcinogenesis is hypothesized. APC mutations are usually the earliest
and possibly the initiating event in about 80% of sporadic colon cancers, with a less frequent
contribution from mutations in mis-match repair genes. During the ensuing progression from
adenoma to carcinoma, additional mutations ensue, such as late mutations or loss of
heterozygosity (LOH) at p53 on chromosome 17p and in the DCC region on chromosome 18q.
Cumulative alterations in the genome thus lead to progressive increases in size, level of
dysplasia, and invasive potential of neoplastic lesions (Crawford, J.M., The Gastrointestinal
Tract, in Robbins Pathologic Basis of Disease, R.S.e.a. Cotran, Editor. 1999, W. B. Saunders
Company: Philadelphia. p. 775-843).
Cytokeratin 20 (CK20)
[0462] CK 20 is a low molecular weight intermediate filament. It is of particular interest
because of its restricted range of expression. CK 20 is consistently expressed in normal and
malignant epithelia. Expression is restricted to the gastric and intestinal epithelium and Merkel
cells. Studies surveying hundreds of epithelial neoplasms from various organ systems by
immunohistochemistry techniques demonstrated that virtually all cases of colorectal carcinomas
are CK 20 positive (Chu, P., E. Wu, and L.M. Weiss, Cytokeratin 7 and cytokeratin 20
expression in epithelial neoplasms: a survey of 435 cases. Mod Pathol, 2000. 13(9): p. 962-72).
The combined CK 20/CK 7 immunoprofile is particularly useful in identifying the primary
site of metastatic tumor in cytologic specimens. CK 20+/CK 7- is only observed in cell blocks in
which the colorectum was the primary site (Blumenfeld, W., et al., Utility of cytokeratin 7 and 20
subset analysis as an aid in the identification of primary site of origin of malignancy in cytologic
specimens. Diagn Cytopathol, 1999. 20(2): p. 63-6). Ascoli and associates (Ascoli, V., et al.,
Utility of cytokeratin 20 in identifying the origin of metastatic carcinomas in effusions. Diagn
Cytopathol, 1995. 12(4): p. 303-8) determined that CK 20 expression by immunohistochemistry
was consistently seen in malignant effusions of colonic origin.

[0463] The following abbreviations were used throughout this example: APC = Adenomatous
Polyposis Coli, AR = Amphiregulin, BGP = Brain-type glycogen phosphorylase, CD44v6 =
CD44 Splice Variant 6, CEA = Carcinoembryonic Antigen, cFLIP = Cellular FLICE-like
inhibitory protein, CK = Cytokeratin, COX = Cyclooxygenase, CR-1 = Cripto, EGFR =
Epidermal Growth Factor Receptor, EphB = Erythropoietin-Producing Amplified Sequence,
FasL = Fas Ligand, FOBT = Fecal Occult Blood Test, HNPCC = Hereditary Nonpolyposis Colon
Cancer, ICC = Immunocytochemistry, IHC = Immunohistochemistry, MLH = Human Mismatch
Repair Genes, MMP = Matrilysin (MMP-7), MSH = Human Mismatch Repair Genes, PLAP =
Placental Alkaline Phosphatase, Rb = Retinoblastoma and YB = Y-box Binding Protein.

III. Bladder Cancer
[0464] Neoplasms of the bladder pose biological and clinical challenges. The incidence of
these epithelial tumors in the United States has been steadily increasing during the past few years
and now amounts to more than 50,000 new cases annually. Despite improvements in detection
and management of these neoplasms, the death toll remains at about 10,000 annually (Crawford,
J.M. and R.S. Cotran, The Lower Urinary Tract, in Robbins Pathologic Basis of Disease, T.
Collins, Editor. 1999, W. B. Saunders Company: Philadelphia. p. 1003-1008). Currently, no
reliable methods exist to screen for and detect bladder cancer. Therefore, better methods to detect
bladder tumors at an early stage are necessary to decrease morbidity and mortality.
[0465] Over 95% of bladder tumors are epithelial in origin, with the remainder being 
mesenchymal tumors. Approximately 90% of epithelial tumors are composed of transitional 
cells and are called transitional cell carcinoma (TCC), while the remaining 10% are squamous 
and glandular carcinomas. TCC is the fifth most common malignancy in the U.S. and second 
most common genitourinary carcinoma. TCC is segregated into two major categories: low-grade 
and high-grade. Low-grade TCCs are always papillary, noninvasive lesions that recapitulate 
normal transitional epithelium. These cases have an excellent prognosis. High-grade TCCs may
be papillary, nodular, or both and exhibit considerable cellular pleomorphism and anaplasia. 
They account for 50% of bladder tumors, have metastatic potential, and are lethal in 60% of 
cases within 10 years of the diagnosis (Crawford, J.M. and R.S. Cotran, The Lower Urinary 
Tract, in Robbins Pathologic Basis of Disease, T. Collins, Editor. 1999, W. B. Saunders 
Company: Philadelphia, p. 1003-1008). 

[0466] Urinary cytology is a conventional screening method and provides useful diagnostic
information for high-grade bladder tumors (Koss, L.G., et al., Diagnostic value of cytology of
voided urine. Acta Cytol, 1985. 29(5): p. 810-6). However, because of the lack of morphologic
alterations in low-grade tumors, its efficacy in detection of low-grade papillary transitional cell
carcinoma (TCC) is less reliable (Busch, C., et al., Malignancy grading of epithelial bladder
tumours. Reproducibility of grading and comparison between forceps biopsy, aspiration biopsy
and exfoliative cytology. Scand J Urol Nephrol, 1977. 11(2): p. 143-8; Murphy, W.M., et al.,
Urinary cytology and bladder cancer. The cellular features of transitional cell neoplasms.
Cancer, 1984. 53(7): p. 1555-65; Shenoy, U.A., T.V. Colby, and G.B. Schumann, Reliability of
urinary cytodiagnosis in urothelial neoplasms. Cancer, 1985. 56(8): p. 2041-5). Cystoscopy is
invasive and bothersome to the patient. Exophytic tumors are reliably diagnosed by cystoscopy,
but flat TCC, particularly carcinoma in situ, remains an endoscopic dilemma (Lin, S., et al.,
Cytokeratin 20 as an immunocytochemical marker for detection of urothelial carcinoma in
atypical cytology: preliminary retrospective study on archived urine slides. Cancer Detect Prev,
2001. 25(2): p. 202-9; Heicappell, R., et al., Evaluation of urinary bladder cancer antigen as a
marker for diagnosis of transitional cell carcinoma of the urinary bladder. Scand J Clin Lab
Invest, 2000. 60(4): p. 275-82). Typically, both urinary cytology and cystoscopy are used
together with biopsy, when necessary, to optimize diagnostic sensitivity. These tests, alone or
combined, are insufficient for early detection or assessment of recurrence or disease progression.
[0467] Immunocytochemistry (ICC) is gaining in popularity because it is noninvasive and has
increased sensitivity compared to urine cytology. Urine tests for nuclear matrix proteins
(NMP22) and human complement factor H-related proteins (BTA stat) have been on the market
for several years, although they have had limited impact on the use of cystoscopy. NMP22 is a
more sensitive test than BTA stat, but they both suffer from insufficient specificity and a
false-positive rate that is problematic (Ramakumar, S., et al., Comparison of screening methods in the
detection of bladder cancer. J Urol, 1999. 161(2): p. 388-94; Ross, J.S. and M.B. Cohen,
Detecting recurrent bladder cancer: new methods and biomarkers. Expert Rev Mol Diagn, 2001.
1(1): p. 39-52). However, FDA recently cleared an improved NMP22 assay, ImmunoCyt, from
Matritech. This new assay contains two antibodies and works like the home pregnancy test. Its
claims include a lower false-positive rate because the assay is not affected by the presence of
blood in urine, and increased diagnostic accuracy in conjunction with cystoscopy. ImmunoCyt is
a cocktail of three tumor markers labeled with fluorescent markers. It recognizes a mucin
glycoprotein and a form of carcinoembryonic antigen (CEA) expressed by tumor cells in the
bladder. However, it is used for monitoring, not screening, of recurrent bladder cancer and needs
fluorescence microscopy for viewing (Mian, C., et al., Immunocyt: a new tool for detecting
transitional cell cancer of the urinary tract. J Urol, 1999. 161(5): p. 1486-9).
[0468] Immunohistochemistry (IHC) is the most widely used evaluation method of bladder
cancer for clinical urologists. Most tumor markers that have been studied and merit a role in the
clinical decision-making process for bladder cancer have evolved from the application of IHC
(Williams, S.G., M. Buscarini, and J.P. Stein, Molecular markers for diagnosis, staging, and
prognosis of bladder cancer. Oncology (Huntingt), 2001. 15(11): p. 1461-70, 1473-4, 1476;
discussion 1476-84). Unfortunately, none of the tumor markers for detecting bladder cancer is a
"magic bullet" with both high sensitivity and specificity. Alternative ways to enhance diagnostic
accuracy are necessary.

[0469] An alternative way to enhance diagnostic accuracy is to develop a panel comprising a
plurality of probes, each of which specifically binds a marker associated with bladder cancer.
Each candidate probe is to be tested by IHC or ICC. For ICC, the specimen will often be a urine
sample. The tradeoff of doing ICC, rather than IHC, is that urine cytological specimens usually
have fewer cells than tissue sections. In some embodiments, high quality monoclonal or
polyclonal antibodies may be used to assure good assay performance. In other embodiments,
patients who are as close as possible in age, gender, diagnosis, tumor grade, tumor size, clinical
stage, etc. may be studied. In further embodiments, the same patient may have urine collections
as often as medically indicated and possible. Additionally, slides can be made by spiking tumor
cells into cell suspension and testing before actual patient specimens are tested.
[0470] The specimen, either formalin-fixed paraffin-embedded (FFPE) tissue for IHC or urine
cytology for ICC, will be obtained from medical institutions. Once the specimens are collected,
the specimens will be processed and analyzed. Statistical analysis will be used to design panels,
as described above for lung cancer. During processing, technical issues such as cell smears or
pellets not sticking to slides during harsh washings may occur in some embodiments. However,
such issues can readily be addressed by manipulation of software or modifying staining protocols
to mitigate such problems. In some embodiments, the specimens will be processed and analyzed
using a device that automatically samples the specimen and prepares monolayer slides for
cyto-interpretation or diagnosis.

[0471] It is anticipated that a broad menu of probes will be used initially. The number of
probes will be pruned to a suitably sized panel in order to retain a high level of sensitivity and
specificity. Selection of the final probes will be based on a pre-defined threshold of the
percentage of positive stained tumor cells. Sophisticated statistical analysis will be employed to
make these determinations. Since the panel-assay approach to detecting malignancies is
applicable to solid tumors, and several of the same tumor markers are in different panels, this
method may be carried out in parallel, as well as serially. In this manner, the assay development
process can be expedited.

[0472] The initial probes will be pruned to a suitably sized panel with high sensitivity and
specificity. The selection of final probes is based on a pre-defined threshold of a percentage of
positive stained tumor cells. Once a final panel of tests using IHC is determined, specimens will
be tested by ICC. The established ICC probes will be tested on urine specimens as a panel. In
some embodiments, automated staining will be employed; therefore, standardization can be
achieved and results interpretation will be more consistent.
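
By way of illustration only, the pruning described above can be sketched as a comparison of candidate panels by sensitivity and specificity. The sketch below assumes a simple rule in which a specimen is called positive when any probe in the panel is positive; that rule, the probe names, and the staining calls are all hypothetical and are not taken from this example.

```python
from typing import Dict, List, Tuple

# Hypothetical per-specimen calls (True = specimen scored positive for the probe).
# Probe names and values are invented for illustration only.
TUMOR: Dict[str, List[bool]] = {
    "probe_A": [True, True, False, False],
    "probe_B": [False, True, True, False],
    "probe_C": [True, True, False, False],
}
CONTROL: Dict[str, List[bool]] = {
    "probe_A": [False, False, False, False],
    "probe_B": [False, True, False, False],
    "probe_C": [False, False, False, False],
}


def panel_calls(results: Dict[str, List[bool]], panel: List[str]) -> List[bool]:
    """Call a specimen positive if any probe in the panel is positive (assumed rule)."""
    n_specimens = len(next(iter(results.values())))
    return [any(results[probe][i] for probe in panel) for i in range(n_specimens)]


def sensitivity(tumor_calls: List[bool]) -> float:
    """True positives / (true positives + false negatives)."""
    return sum(tumor_calls) / len(tumor_calls)


def specificity(control_calls: List[bool]) -> float:
    """True negatives / (false positives + true negatives)."""
    return sum(not call for call in control_calls) / len(control_calls)


def evaluate(panel: List[str]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a panel on the hypothetical data."""
    return (
        sensitivity(panel_calls(TUMOR, panel)),
        specificity(panel_calls(CONTROL, panel)),
    )


if __name__ == "__main__":
    for panel in (
        ["probe_A"],
        ["probe_A", "probe_B"],
        ["probe_A", "probe_B", "probe_C"],
    ):
        sens, spec = evaluate(panel)
        print(panel, f"sensitivity={sens:.2f}", f"specificity={spec:.2f}")
```

In this hypothetical data set, adding probe_C to the two-probe panel changes neither figure, which is the kind of redundancy that pruning is meant to remove. A different combination rule or a formal statistical method could equally be used.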
[0473] Compared with lung cancers, which have five subtypes (adenocarcinoma, squamous cell
carcinoma, small cell carcinoma, large cell carcinoma, mesothelioma), over 90% of bladder
cancer is categorized as transitional cell carcinoma (TCC). This allows the bladder tumor panel
to be more specific at targeting one type of cancer. In addition, detection and diagnosis of
bladder tumor can be FFPE-based and/or cell-based (urinary cytology).

Library of Probes/Markers 
[0474] Various published scientific sources containing information about cancer markers were
reviewed. An arbitrary criterion of 20% or greater positivity of bladder cancer was used to select
probes for a preferred panel for detection and/or diagnosis of bladder cancer. The term "20% or
greater positivity" means that if 100 tumor cases were studied, 20 or more of these cases would
have shown a presence of the individual marker, while the remaining 80 or fewer cases would
not have shown a presence of the individual marker. A preferred panel may include markers
selected from BL2-10D1, C-erbB-2, CD44s Standard, Splice Variant CD44v6, Splice Variant
CD44v3, Caveolin-1, Collagenase, Cyclin D1, Cyclooxygenase-1 (COX-1), Cyclooxygenase-2
(COX-2), Cytokeratin 20 (CK20), E-cadherin, Epidermal Growth Factor Receptor (EGFR), Heat
Shock Protein-90 (HSP-90), IL-6, IL-10, HLA-DR, Human Mis-Match Repair Gene (hMSH2),
Lewis X, MDM2, Nuclear Matrix Protein 22 (NMP-22), p53, PCNA, MIB1 (Ki-67),
Retinoblastoma (Rb), Survivin, Transforming Growth Factor-β1, Transforming Growth Factor-β1
Receptor I, Transforming Growth Factor-β1 Receptor II and UBC (CK8 and CK18). A brief
description of the library of probes/markers utilized in the present example is provided below.
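
By way of illustration only, the 20% positivity criterion defined above can be expressed as a simple check on case counts. The function name and the default threshold parameter below are hypothetical conveniences; the 18-of-29 count is the Cox-1 figure reported later in this example.

```python
# Minimal sketch of the "20% or greater positivity" criterion.
# The function name and signature are illustrative assumptions.
THRESHOLD_PERCENT = 20


def meets_positivity_criterion(
    positive_cases: int,
    total_cases: int,
    threshold_percent: int = THRESHOLD_PERCENT,
) -> bool:
    """Return True if positive_cases / total_cases is at least threshold_percent."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return positive_cases * 100 >= threshold_percent * total_cases


if __name__ == "__main__":
    print(meets_positivity_criterion(20, 100))  # boundary case from the definition
    print(meets_positivity_criterion(19, 100))  # one case short of the criterion
    print(meets_positivity_criterion(18, 29))   # Cox-1: 18 of 29 TCC specimens
```

Under this definition, a marker present in exactly 20 of 100 cases qualifies, while one present in 19 of 100 does not.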

BL2-10D1 

[0475] This is an IgM antibody. A hybridoma cell line secreting an IgM monoclonal antibody
was produced after immunizing a mouse with RT4 cells and a suspension of human bladder
carcinoma cells (Longin, A., et al., A monoclonal antibody (BL2-10D1) reacting with a
bladder-cancer-associated antigen. Int J Cancer, 1989. 43(2): p. 183-9). It shows a strong reactivity with
bladder tumors but not with normal urothelium, except for 5% to 10% of umbrella cells. This
antibody reacts with most of the papillary Grade 1 and Grade 2 TCCs and with carcinoma in situ,
whereas papillary Grade 3 and invasive non-papillary TCC show poor reactivity. Longin
(Longin, A., et al., A useful monoclonal antibody (BL2-10D1) to identify tumor cells in urine
cytology. Cancer, 1990. 65(6): p. 1412-7) and associates demonstrated that all urine from patients
with low-grade (G1) tumors was stained with BL2-10D1. Grades 2 or 3 were not always
stained. Such results suggest that BL2-10D1 may be considered a valuable marker for early
detection of bladder cancer.
C-erbB-2

[0476] The c-erbB-2 gene encodes a transmembrane tyrosine kinase that is the receptor for a
family of peptide hormones. C-erbB-2 amplification has been found in transitional cell
carcinomas. Previous studies have observed an association of c-erbB-2 with metastasis, as well
as with tumor grade or stage (Ioachim, E., et al., Immunohistochemical expression of
retinoblastoma gene product (Rb), p53 protein, MDM2, c-erbB-2, HLA-DR and proliferation
indices in human urinary bladder carcinoma. Histol Histopathol, 2000. 15(3): p. 721-7).

CD44 Standard (CD44s) and Splice Variants CD44v6 and CD44v3
[0477] CD44 is a transmembrane cell surface receptor. It has been associated with diverse
functions, including cell-to-cell adhesion, cell matrix interaction, and tumor metastasis. The
significance of CD44 isoforms in tumor development and progression has been reported in
various tumors. In a study by Masuda et al. (Masuda, M., et al., Expression and prognostic value
of CD44 isoforms in transitional cell carcinoma of renal pelvis and ureter. J Urol, 1999. 161(3):
p. 805-8; discussion 808-9), expression of CD44s, CD44v6 and CD44v3 was significantly
decreased in relation to histologic grade of bladder cancer. However, all of these isoforms were
expressed strongly on the cytoplasmic membrane of basal cells of normal urothelial mucosa,
whereas the superficial layers of normal urothelial mucosa did not express them.

Caveolin-1

[0478] Caveolae are abundant in numerous cell types, ranging from adipocytes and endothelial
cells to type I pneumocytes and skeletal muscle cells. Three constituent caveolin protein family
members have been identified: caveolin-1, caveolin-2 and caveolin-3. The CAV-1 gene has been
mapped to chromosome 7q31 and has attracted much scientific interest as a potential site of tumor
suppressor activity. It is presumably involved in signal transduction, interacting with a broad range
of signal transducing molecules and receptors (Src, G protein, and EGFR). Rajjayabun (Ioachim,
E., et al., Immunohistochemical expression of retinoblastoma gene product (Rb), p53 protein,
MDM2, c-erbB-2, HLA-DR and proliferation indices in human urinary bladder carcinoma.
Histol Histopathol, 2000. 15(3): p. 721-7) first showed CAV-1 immunoreactivity in high-grade
bladder cancers.
Collagenase

[0479] Collagenase is known to dissolve collagen and repair and maintain tissues. It is secreted
from epithelial cells, neutrophils, histiocytes and fibroblasts. Increased expression of collagenase
has been associated with breast and thyroid cancer. An RT-PCR study showed increased mRNA in
urothelial carcinoma, and one study demonstrated that 34% and 45% of patients with TCC showed
positive expression of collagenase on cytologic and histologic specimens, respectively. No
expression was found in benign lesions (Hattori, M., E. Ohno, and H. Kuramoto,
Immunocytochemistry of collagenase expression in transitional cell carcinoma of the bladder.
Acta Cytol, 2000. 44(5): p. 771-7).
Cyclin D1

[0480] The Cyclin D1 gene product contributes to the regulation of the G1/S-phase transition of the cell
cycle and is a candidate oncogene. It has been shown to correlate with low-grade, low-stage and
papillary tumor growth in primary bladder carcinomas, and it has been suggested to play an
important role in bladder cancer progression (Byrne, R.R., et al., E-cadherin immunostaining of
bladder transitional cell carcinoma, carcinoma in situ and lymph node metastases with
long-term followup. J Urol, 2001. 165(5): p. 1473-9).

Cyclooxygenase-1 (Cox-1) and Cyclooxygenase-2 (Cox-2) 
[0481] Cyclooxygenases are the rate-limiting enzymes catalyzing the initial step in the
formation of prostaglandins that are involved in inflammation, immune responses, mitogenesis
and apoptosis. Cyclooxygenase-1 (Cox-1) is constitutively expressed in most tissues at a rather
stable level. The low basal activity of the inducible form, cyclooxygenase-2 (Cox-2), is
increased during inflammatory processes by cytokines, growth factors, oncogenes and tumor
promoters. Increased cyclooxygenase activity and, consequently, elevated prostaglandin levels
have been observed in gastroenterological malignancies, as well as bladder cancer. Bostrom et al.
(Bostrom, P.J., et al., Expression of cyclooxygenase-1 and -2 in urinary bladder carcinomas in
vivo and in vitro and prostaglandin E2 synthesis in cultured bladder cancer cells. Pathology,
2001. 33(4): p. 469-74) showed a diffuse, moderate to strong, cytoplasmic immunosignal for
Cox-2 that was detected in all 29 TCCs of bladder. Normal urothelium in the specimen also stained
for Cox-2, but the intensity was weak. Cox-1 staining was detected in tissue from 18 of 29 TCC
specimens (62%), but the signal was weak in 16 of the 18 specimens. Immunosignal from Cox-1
was detected only in a few normal specimens and was weak to moderate.
Cytokeratin 20 (CK20)
[0482] Cytokeratin 20 (CK 20), one of 20 known cytokeratins, is a constituent of the
intermediate filaments of epithelial cells. An IHC study has shown that CK 20 was expressed in
urothelial cells of patients with urothelial carcinoma or urothelial dysplasia. In normal
urothelium, CK 20 expression was restricted to superficial umbrella cells. By using ICC, Lin
(Lin, S., et al., Cytokeratin 20 as an immunocytochemical marker for detection of urothelial
carcinoma in atypical cytology: preliminary retrospective study on archived urine slides. Cancer
Detect Prev, 2001. 25(2): p. 202-9) and associates showed CK 20 was positive in 95% of bladder
cancer patients and positive in only 10% of normal controls.

E-cadherin 

[0483] E-cadherin is expressed in all epithelial tissue and is found on the plasma membrane of
squamous and transitional cells. E-cadherin mediated cell adhesion is involved in tumor
progression and metastasis. IHC studies of E-cadherin in transitional cell carcinoma of the
bladder have demonstrated a significant association of aberrant E-cadherin expression with
advanced tumor stage and loss of differentiation. Byrne et al. (Byrne, R.R., et al., E-cadherin
immunostaining of bladder transitional cell carcinoma, carcinoma in situ and lymph node
metastases with long-term followup. J Urol, 2001. 165(5): p. 1473-9) showed 59 (77%) bladder
tumors had loss of normal membrane E-cadherin, whereas preserved E-cadherin expression was
seen in normal urothelium.

Epidermal Growth Factor Receptor (EGFR) 
[0484] Epidermal growth factor is a potent mitogen and its actions are mediated by binding to
the external domain of epidermal growth factor receptor (EGFR). EGFR is a transmembrane
protein receptor with tyrosine kinase activity. The cytoplasmic and internal domains of EGFR
have close similarity with the oncogene product of the avian erythroblastosis virus (v-erb-B-2).
Increased levels of EGFR are found in solid tumors, including squamous cell carcinoma of lung,
head, neck, cervix, breast, prostate and bladder (Neal, D.E., et al., The epidermal growth factor
receptor and the prognosis of bladder cancer. Cancer, 1990. 65(7): p. 1619-25). The range of
positivity of bladder cancer is 31-48% (American Cancer Society. 2000).

Heat Shock Protein-90 (HSP-90), IL-6 and IL-10
[0485] During the early stages of bladder cancer, a cascade of immunological reactions takes
place and various proteins, including heat shock protein (HSP-90) and cytokines, are secreted in
large amounts. HSP-90 is one of the most important members of the HSP family. Of the 56
bladder carcinomas studied by IHC, 52 (93%) expressed HSP-90, 48 (86%) expressed IL-6 and 45
(80%) expressed IL-10. High-grade and muscle-invasive tumors contained significantly higher
levels of HSP-90 and IL-6 than low-grade tumors. Normal urothelium adjacent to tumor areas does
not show HSP-90 staining. IL-6 and IL-10 showed scarce immunoreactivity (Cardillo, M.R., P.
Sale, and F. Di Silverio, Heat shock protein-90, IL-6 and IL-10 in bladder cancer. Anticancer
Res, 2000. 20(6B): p. 4579-83). This suggests IL-6 and IL-10 may be turned on at a relatively
low stage during tumor development.
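
By way of illustration only, the percentages reported above for the 56 carcinomas can be recomputed from the stated case counts. The helper name below is a hypothetical convenience; the counts and reported percentages are those given in the preceding paragraph.

```python
# Recompute the reported positivity percentages from the stated counts.
# The helper name `percent` is an illustrative assumption.
TOTAL_CASES = 56

# marker -> (positive cases, percentage reported in the text)
REPORTED = {
    "HSP-90": (52, 93),
    "IL-6": (48, 86),
    "IL-10": (45, 80),
}


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


recomputed = {marker: percent(count, TOTAL_CASES) for marker, (count, _) in REPORTED.items()}

if __name__ == "__main__":
    for marker, (count, reported) in REPORTED.items():
        status = "matches" if recomputed[marker] == reported else "differs from"
        print(f"{marker}: {count}/{TOTAL_CASES} = {recomputed[marker]}% ({status} reported {reported}%)")
```

The same check can be applied to any other count and percentage pair quoted in this example.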
HLA-DR 

[0486] Although the functional role of HLA-DR is to mediate communication among
immunocompetent cells, it also has been shown that HLA-DR antigen expression is independent of
lymphocyte subpopulations in bladder cancer (Ioachim, E., et al., Immunohistochemical
expression of retinoblastoma gene product (Rb), p53 protein, MDM2, c-erbB-2, HLA-DR and
proliferation indices in human urinary bladder carcinoma. Histol Histopathol, 2000. 15(3): p.
721-7).

Human Mis-Match Repair Gene (hMSH2) 
[0487] This is the prototype mismatch repair (MMR) gene. hMSH2 recognizes and binds to
nucleotide mismatches and, in conjunction with other MMR proteins, directs the coordinated
correction of DNA replication errors. Mutation of MMR genes has been reported in 40% of
hereditary non-polyposis colon cancer (HNPCC), and recent IHC studies suggest that bladder
tumors with decreased expression of hMSH2 are associated with higher grade and recurrence.
One study showed all bladder tumors are positive for hMSH2, whereas normal bladder mucosa
is negative by IHC (Leach, F.S., et al., Expression of the human mismatch repair gene hMSH2:
a potential marker for urothelial malignancy. Cancer, 2000. 88(10): p. 2333-41).

Lewis X 

[0488] ABH and Lewis blood group-related antigens are present on the surface of normal
urothelium. The Lewis X antigen is normally absent from urothelial cells in the adult, except for
occasional umbrella cells. Sheinfeld (Sheinfeld, J., et al., Enhanced bladder cancer detection
with the Lewis X antigen as a marker of neoplastic transformation. J Urol, 1990. 143(2): p.
285-8) and associates have used immunostaining of the Lewis X antigen on epithelial cells from
bladder washing specimens for detection of bladder tumors and reported sensitivity of 86% and
specificity of 87%. Golijanin (Golijanin, D., et al., Detection of bladder tumors by
immunostaining of the Lewis X antigen in cells from voided urine. Urology, 1995. 46(2): p.
173-7) and associates also showed high sensitivity and specificity of Lewis X immunostaining of
urine samples. High-grade and low-grade transitional cell tumors were detected with equal
efficiency.

MDM2 

[0489] This is a cellular proto-oncogene product. MDM2 has been shown to bind to p53 and
act as a negative regulator, inhibiting its transcriptional trans-activation. It has been shown that
aberrant MDM2 and p53 phenotypes may be important diagnostic markers in bladder cancer
patients (Ioachim, E., et al., Immunohistochemical expression of retinoblastoma gene product
(Rb), p53 protein, MDM2, c-erbB-2, HLA-DR and proliferation indices in human urinary
bladder carcinoma. Histol Histopathol, 2000. 15(3): p. 721-7).

Nuclear Matrix Protein 22 (NMP-22) 
[0490] This nuclear matrix protein plays an important role in the structural framework of the
nucleus, in DNA replication and in gene expression. Significantly increased concentrations of
NMPs have been found with neoplastic transformation and in carcinomas of the breast, colon and
bladder. Soluble NMPs can be detected in the urine from bladder cancers using antibodies
against select epitopes of NMP (NMP-22). Landman et al. (Landman, J., et al., Sensitivity and
specificity of NMP-22, telomerase, and BTA in the detection of human bladder cancer. Urology,
1998. 52(3): p. 398-402) found the overall sensitivity to be 81%.

p53

[0491] This human tumor suppressor gene encodes a nuclear phosphoprotein that facilitates
DNA repair after genomic damage. Wild-type p53 is degraded, whereas mutant p53 is not and
therefore accumulates in the cell. Mutant p53 can be detected by IHC. Several studies associate p53
mutation with high-grade bladder cancer and unfavorable prognosis (Vollmer, R.T., et al.,
Invasion of the bladder by transitional cell carcinoma: its relation to histologic grade and
expression of p53, c-erbB-2, epidermal growth factor receptor and bcl-2. Cancer, 1998.
82(4): p. 715-23; Nakopoulou, L., et al., The prevalence of bcl-2, p53 and Ki-67
immunoreactivity in transitional cell bladder carcinomas and their clinicopathologic correlates.
Hum Pathol, 1998. 29(2): p. 146-54; Vorreuther, R., et al., Expression of immunohistochemical
markers (PCNA, Ki-67, 486p and p53) on paraffin sections and their relation to the recurrence
rate of superficial bladder tumors. Urol Int, 1997. 59(2): p. 88-94; Li, B., et al., Reciprocal
expression of bcl-2 and p53 oncoproteins in urothelial dysplasia and carcinoma of the urinary
bladder. Urol Res, 1998. 26(4): p. 235-41). The reported p53 mutation rate in bladder cancer
varies from 30% to 58%. Wu et al. (Wu, T.T., et al., The role of bcl-2, p53, and Ki-67 index in
predicting tumor recurrence for low grade superficial transitional cell bladder carcinoma. J
Urol, 2000. 163(3): p. 758-60) demonstrated 42 of 60 (70%) cases had p53 mutation.

PCNA and MIB1 (Ki-67)
[0492] Both are cell proliferation nuclear markers. They are expressed in a variety of tumors
including bladder cancer. Studies demonstrated a strong correlation between PCNA and MIB1,
suggesting either of the markers could be used (Ioachim, E., et al., Immunohistochemical
expression of retinoblastoma gene product (Rb), p53 protein, MDM2, c-erbB-2, HLA-DR and
proliferation indices in human urinary bladder carcinoma. Histol Histopathol, 2000. 15(3): p.
721-7; Lavezzi, A.M., L. Terni, and L. Matturri, PCNA immunostaining as a valid alternative to
tritiated thymidine-autoradiography to detect proliferative cell fraction in transitional cell
bladder carcinomas. In Vivo, 2000. 14(3): p. 447-51).

Retinoblastoma (Rb) 

[0493] Rb is a tumor suppressor gene and may play a role in the initiation and progression of 
human tumors. Rb codes for nuclear phosphoproteins that are present in normal cells and are thought to
be involved in cell cycle control and in negative regulation of cell growth. Alterations of the Rb
gene have been observed in several epithelial tumors, suggesting that structural abnormalities,
including mutations and/or deletions of this gene, may result in the inactivation of tumor
suppressor protein and may be involved in tumorigenesis.
Survivin

[0494] Survivin is a 142 amino acid protein. It is expressed in the G2/M phase of the cell cycle 
and associates with microtubules of the mitotic spindle. It is an inhibitor of apoptosis that is 
selectively over-expressed in common human cancers, but not in normal tissues, and correlates 
with aggressive disease and unfavorable outcomes (Lehner, R., et al., Immunohistochemical
localization of the IAP protein survivin in bladder mucosa and transitional cell carcinoma. Appl
Immunohistochem Mol Morphol, 2002. 10(2): p. 134-8). 
Transforming Growth Factor-β1, and Receptors I and II
[0495] TGF-β1 is a pleiotropic growth factor regulating cellular proliferation, differentiation,
migration, immune response, angiogenesis and apoptosis. It has been shown to inhibit normal
epithelial cell growth by reducing the ability of cells to enter S-phase. Conversely, many
carcinoma cells show resistance to growth inhibition by TGF-β1. The effects of TGF-β1 are
mediated by membrane-bound serine-threonine kinase receptors, currently referred to as TGF-β1
receptor I (TGF-β1I) and TGF-β1 receptor II (TGF-β1II). The loss of expression of receptors I
and II has been found in several cancers including prostate, colon, ovarian, and bladder. In one
study, expression of TGF-β1, TGF-β1I, and TGF-β1II was altered in 51 (64%), 34 (43%), and 38
(48%) of bladder cancer cases, respectively. Over-expression of TGF-β1 was seen in high-grade tumors,
whereas TGF-β1 was negative in normal bladder mucosa. Loss of TGF-β1I and TGF-β1II was associated with invasive
tumor stage (Kim, J.H., et al., Predictive value of expression of transforming growth
factor-beta(1) and its receptors in transitional cell carcinoma of the urinary bladder. Cancer, 2001.
92(6): p. 1475-83).

UBC (CK8 and CK18)
[0496] This is an enzyme immunoassay that measures the concentration of cytokeratin
fragments 8 and 18 in the urine. These cytokeratins are co-expressed by simple epithelia and
carcinomas, among them normal urothelium and TCC. Heicappell et al. demonstrated
sensitivities of UBC ranging from 22% to 75%, with an overall specificity of 77%
(Heicappell, R., et al., Evaluation of urinary bladder cancer antigen as a marker for diagnosis of
transitional cell carcinoma of the urinary bladder. Scand J Clin Lab Invest, 2000. 60(4): p.
275-82).

Tissue-Based Markers 
[0497] A tissue-based panel for detecting and/or diagnosing bladder cancer includes diagnostic
and prognostic markers, as well as those associated with metastatic potential, tumor grade and
stage determinations. It has potential value not only in patient screening and diagnosis, but also
in patient stratification, to guide clinical therapy. Tissue-based panels can also be tested on urine
specimens to see whether the results are superior to the existing cell-based panel. A preferred
tissue-based panel for detecting and/or diagnosing bladder cancer comprises markers selected
from p53, Rb, MDM2, EGFR and survivin. The importance of these tumor markers with respect
to bladder cancer is detailed below.

[0498] Detection of high-grade bladder cancer usually is not a problem because severe
cytological atypia is present. On the other hand, detection of low-grade tumors is a continuing
challenge for the cytopathologist. Understanding the pathogenesis of a bladder tumor is crucial
to pinpointing the earliest events during tumorigenesis. Cigarette smoking, exposure to
arylamines, long-term use of analgesics, and exposure to cyclophosphamide increase the risk of
bladder cancer. How these influences induce cancer is unclear, but a number of genetic
alterations have been observed in transitional cell carcinoma. The evidence suggests that
chromosome deletions involving tumor-suppressor genes and subsequent loss of cell-cycle
control are important in the early development of bladder tumors (Crawford, J.M. and R.S.
Cotran, The Lower Urinary Tract, in Robbins Pathologic Basis of Disease, T. Collins, Editor.
1999, W. B. Saunders Company: Philadelphia. p. 1003-1008; Williams, S.G., M. Buscarini, and
J.P. Stein, Molecular markers for diagnosis, staging, and prognosis of bladder cancer. Oncology
(Huntingt), 2001. 15(11): p. 1461-70, 1473-4, 1476; discussion 1476-84).
[0499] Between 30% and 60% of bladder tumors have chromosome 9 monosomy or deletions of
9p and 9q, as well as deletions of 17p, 13q, 11p and 14q (Gibas, Z. and L. Gibas, Cytogenetics of
bladder cancer. Cancer Genet Cytogenet, 1997. 95(1): p. 108-15). Chromosome 9 deletions are
the only genetic changes frequently present in superficial papillary tumors and occasionally in
noninvasive flat tumors. The 9p deletion, 9p21, involves tumor-suppressor gene p16. However,
studies demonstrate that p16 protein expression is not decreased in 9p deletions. On the other
hand, many invasive transitional cell carcinomas show deletions of 17p and mutations in the p53
gene, suggesting that alterations in p53 contribute to the progression of transitional cell carcinoma.
[0500] Mutations in p53 are also found in flat in situ bladder cancer. MDM2, a proto-oncogene,
plays a pivotal role in regulation of the cell cycle by binding with p53. MDM2 can
also interact with other critical elements of the cell cycle and apoptotic regulatory controls, such
as E2F and Rb (Martin, K., et al., Stimulation of E2F1/DP1 transcriptional activity by MDM2
oncoprotein. Nature, 1995. 375(6533): p. 691-4; Xiao, Z.X., et al., Interaction between the
retinoblastoma protein and the oncoprotein MDM2. Nature, 1995. 375(6533): p. 694-8). The
13q deletion is that of the retinoblastoma gene (Rb), and mutation of the Rb gene is considered to
be of central importance in the pathogenesis of many human malignant neoplasms. Recently, it
has become clear that the genomic loss of pRb leads to uncontrolled cell growth and apoptosis
(i.e., cell death). Previous studies in bladder cancer have reported that inactivation of Rb is
associated with tumor progression (Ioachim, E., et al., Immunohistochemical expression of
retinoblastoma gene product (Rb), p53 protein, MDM2, c-erbB-1, HLA-DR and proliferation
indices in human urinary bladder carcinoma. Histol Histopathol, 2000. 15(3): p. 721-7).
[0501] The recurrence rate is high for low-grade bladder cancer. For patients without muscle
invasion, the long-term outlook is better, but more than half develop recurrent tumors within
the bladder that are usually similar in stage and grade to the primary cancer. EGFR has been
shown to be associated with an increased recurrence rate and with the progression of superficial
to invasive bladder cancer (Neal, D.E., et al., The epidermal growth factor receptor and the
prognosis of bladder cancer. Cancer, 1990. 65(7): p. 1619-25; Izawa, J.I., et al., Differential
expression of progression-related genes in the evolution of superficial to invasive transitional
cell carcinoma of the bladder. Oncol Rep, 2001. 8(1): p. 9-15).

[0502] Survivin is a new bladder tumor marker that is expressed in the G2/M phase of the cell
cycle and associates with microtubules of the mitotic spindle (Li, B., et al., Reciprocal
expression of bcl-2 and p53 oncoproteins in urothelial dysplasia and carcinoma of the urinary
bladder. Urol Res, 1998. 26(4): p. 235-41). Disruption of survivin-microtubule interactions
results in the loss of survivin's anti-apoptotic function and in activation of caspase-3.
Caspase-3 is an essential death factor for Fas-mediated cell death, and its inactivation in cells
is initiated by an interaction with p21 (Tamm, I., et al., IAP family protein survivin inhibits
caspase activity and apoptosis induced by Fas (CD95), Bax, caspases, and anticancer drugs.
Cancer Res, 1998. 58(23): p. 5315-20). Survivin is also involved in cell-cycle control. More
importantly, studies show that nuclear staining of survivin is present in a majority of
transitional cell carcinomas and absent in healthy bladder mucosa (Lehner, R., et al.,
Immunohistochemical localization of the IAP protein survivin in bladder mucosa and transitional
cell carcinoma. Appl Immunohistochem Mol Morphol, 2002. 10(2): p. 134-8).

Cell-Based Immunocytochemical Markers
[0503] The cell-based panel has markers specific for early stage, low-grade TCC, as well as for
late, high-grade TCC. It can be used to screen and detect early stage, low-grade TCC. For
difficult cases, a cystoscopic biopsy can follow to confirm the diagnosis with the tissue-based
panel. It can also be used to diagnose high-grade TCC if a patient presents at late stage.
The results can guide therapeutic decisions by urologists. A preferred cell-based panel for
detecting and/or diagnosing bladder cancer using immunocytochemical techniques comprises
markers selected from BL2-10D1, cytokeratin 20, Lewis X and NMP-22. The importance of
these tumor markers with respect to bladder cancer is detailed below.
[0504] The introduction of novel molecular markers into clinical urology has increased the
diagnostic accuracy of bladder cancer compared to conventional urinary cytology. A number of
commercial products, as well as individual markers, are available. Each marker, however,
exhibits a different sensitivity and specificity. Furthermore, no single immunocytochemical
assay has yet replaced invasive cystoscopy.

[0505] Uniqueness is evident with several markers. For example, BL2-10D1 has a consistent
reactivity with low-grade TCC versus its non-consistent reactivity with high-grade TCC (Longin,
A., et al., A useful monoclonal antibody (BL2-10D1) to identify tumor cells in urine cytology.
Cancer, 1990. 65(6): p. 1412-7). CK 20 has an overall sensitivity of 94% and specificity of 80%
for the detection of TCC (Lin, C.W., et al., Detection of tumor cells in bladder washings by a
monoclonal antibody to human bladder tumor-associated antigen. J Urol, 1988. 140(3): p. 672-7);
in benign epithelium, staining is limited to the surface umbrella cells, while in
dysplastic epithelium the staining involves the full thickness (Harnden, P., et al., Cytokeratin
20 as an objective marker of urothelial dysplasia. Br J Urol, 1996. 78(6): p. 870-5).
Immunostaining of the Lewis X antigen on epithelial cells from bladder washing specimens showed a
sensitivity of 86% and specificity of 87% (Sheinfeld, J., et al., Enhanced bladder cancer
detection with the Lewis X antigen as a marker of neoplastic transformation. J Urol, 1990.
143(2): p. 285-8). In addition, studies comparing NMP-22, telomerase, and BTA assays demonstrate
that NMP-22 has the highest sensitivity and specificity (Landman, J., et al., Sensitivity and
specificity of NMP-22, telomerase, and BTA in the detection of human bladder cancer. Urology,
1998. 52(3): p. 398-402; Friedrich, M.G., et al., Clinical use of urinary markers for the
detection and prognosis of bladder carcinoma: a comparison of immunocytology with monoclonal
antibodies against Lewis X and 486p3/12 with the BTA STAT and NMP22 tests. J Urol, 2002.
168(2): p. 470-4).
[0506] By combining biomarkers into one or more panel-assays, sensitivities and specificities
increase significantly (Eissa, S., et al., Comparative evaluation of the nuclear matrix protein,
fibronectin, urinary bladder cancer antigen and voided urine cytology in the detection of bladder
tumors. J Urol, 2002. 168(2): p. 465-9). An ideal bladder cancer assay should be non-invasive,
sensitive, specific, and cost effective. While the tissue-based panel and the cell-based panel may
each be used alone, markers from a tissue-based panel and from a cell-based panel can also be
combined to yield a final optimized panel.
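
As a rough illustration of how combining markers shifts panel performance, the sketch below applies two simple decision rules to the CK 20 and Lewis X figures quoted above. It assumes that the two markers err independently of each other, which the cited studies do not establish. The function names are hypothetical, and these rules are not the statistical panel-selection procedure of this example.

```python
def any_positive(markers):
    """Panel is positive if at least one marker is positive (OR rule).

    `markers` is a list of (sensitivity, specificity) pairs. The markers are
    assumed to be conditionally independent, an illustrative simplification.
    """
    miss = 1.0  # probability that every marker misses a true case
    spec = 1.0  # probability that every marker is negative in a non-case
    for sens_i, spec_i in markers:
        miss *= 1.0 - sens_i
        spec *= spec_i
    return 1.0 - miss, spec


def all_positive(markers):
    """Panel is positive only if every marker is positive (AND rule)."""
    sens = 1.0       # probability that every marker detects a true case
    false_pos = 1.0  # probability that every marker is falsely positive
    for sens_i, spec_i in markers:
        sens *= sens_i
        false_pos *= 1.0 - spec_i
    return sens, 1.0 - false_pos


# (sensitivity, specificity) as quoted in the text above.
ck20 = (0.94, 0.80)
lewis_x = (0.86, 0.87)

or_sens, or_spec = any_positive([ck20, lewis_x])
and_sens, and_spec = all_positive([ck20, lewis_x])
print(f"any-positive rule: sensitivity {or_sens:.3f}, specificity {or_spec:.3f}")
print(f"all-positive rule: sensitivity {and_sens:.3f}, specificity {and_spec:.3f}")
```

Under these assumptions the any-positive rule raises sensitivity at some cost in specificity and the all-positive rule does the reverse, so improving both together depends on which markers are chosen and how their results are weighted.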

[0507] The following abbreviations were used throughout this example: CD44s = CD44
Standard, CD44v6 = CD44 splice variant 6, CD44v3 = CD44 splice variant 3, CK = Cytokeratin,
Cox-1 = Cyclooxygenase-1, Cox-2 = Cyclooxygenase-2, EGFR = Epidermal Growth Factor
Receptor, ELISA = Enzyme Linked Immunosorbent Assay, FFPE = Formalin Fixed Paraffin
Embedded, FNA = Fine Needle Aspiration, FNB = Fine Needle Biopsy, HLA-DR = Human
Leukocyte Antigen-DR, hMSH2 = Human Mismatch Repair Gene, HNPCC = Hereditary
Nonpolyposis Colon Cancer, HSP-90 = Heat Shock Protein-90, IL-6 = Interleukin-6, IL-10 =
Interleukin-10, ICC = Immunocytochemistry, IHC = Immunohistochemistry, ISH = In Situ
Hybridization, NMP22 = Nuclear Matrix Protein 22, PCNA = Proliferating Cell Nuclear Antigen,
PCR = Polymerase Chain Reaction, TCC = Transitional Cell Carcinoma, TGF = Transforming
Growth Factor, STDs = Sexually Transmitted Diseases and Rb = Retinoblastoma.

IV. Prostate Cancer 
[0508] A variety of prostate tumor markers have been discovered to aid physicians in making
timely, precise diagnoses, and to provide significantly better patient management. Unfortunately,
none of these prostate tumor markers is a "magic bullet" with both high sensitivity and
specificity. Therefore, alternative ways to enhance diagnostic accuracy are necessary.
[0509] An alternative way to enhance diagnostic accuracy is to develop a panel comprising a
plurality of probes, each of which specifically binds a marker associated with prostate cancer. All
candidate probes are to be tested with ICC and/or IHC techniques. In some embodiments,
specimens may be obtained from a fine needle biopsy. Fine needle aspiration (FNA) is usually
used for superficial and external organs, such as breast, thyroid or a similar soft tissue mass,
whereby a cell-suspension cytology specimen can be obtained. In the case of deep and
internal organs, such as prostate, kidney or liver, a fine needle biopsy (FNB) technique is often
used, which may be guided by ultrasound or CT, and a tissue specimen is obtained. With the
prostate gland, ultrasound is often used and a FNB is performed transrectally. In other
embodiments, touch preparations (touch preps) of FNBs can be performed to generate an imprint
of cells on a glass slide from tissue. In the case of a fluid or a bloody specimen from a FNB
procedure, a cell suspension specimen can be generated.
[0510] Once the specimens are collected, the specimens will be processed and analyzed.
Statistical analysis will be used to design panels, as described above for lung cancer. During
processing, technical issues such as cell smears or pellets not sticking to slides during harsh
washings may occur in some embodiments. However, such issues can readily be addressed by
manipulation of software or by modifying staining protocols. In some
embodiments, the specimens will be processed and analyzed using a device that automatically
samples the specimen and prepares slides for diagnosis. It is anticipated that a broad menu of
probes will be used initially. The number of probes will then be pruned to a suitably sized panel
that retains a high level of sensitivity and specificity. Selection of the final probes will be
based on a pre-defined threshold of the percentage of positive stained tumor cells. Sophisticated
statistical analysis will be employed to make these determinations. Since the panel-assay
approach to detecting malignancies is applicable to solid tumors, and several of the same tumor
markers are in different panels, this method may be carried out in parallel, as well as serially. In
this manner, the assay development process can be expedited.

Library of Probes/Markers
[0511] Various sources containing information about cancer markers were reviewed. An
arbitrary criterion of 20% or greater positivity of prostate cancer was used to select probes for a
preferred panel for detection and/or diagnosis of prostate cancer. The term "20% or greater
positivity" means that if 100 tumor cases were studied, 20 or more of these cases would have
shown a presence of the individual marker, while the remaining cases would not have shown a
presence of the individual marker. A preferred panel may include molecular markers selected
from 34βE12, B72.3, C-erb-B2, E-Cadherin, Epidermal Growth Factor Receptor (EGFR), Fatty
Acid Synthase (FAS), ID-1, Kallikrein 2, Ki-67, Leu 7, MDM2, N-Cadherin, P504S, p53,
Prostate Acid Phosphatase (PAP), Prostate Inhibin Peptide (PI) and Prostate Specific Antigen
(PSA). A brief description of the library of probes/markers utilized in the present example is
provided below.
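
The 20% positivity criterion can be sketched as a simple filter. The EGFR and FAS counts below are prostate cancer figures quoted in the marker descriptions of this example; the function names, the dictionary layout and the entry "marker_X" are hypothetical and serve only to illustrate the rule.

```python
def positivity(positive_cases, total_cases):
    """Fraction of tumor cases in which the marker was detected."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return positive_cases / total_cases


def select_probes(case_counts, threshold=0.20):
    """Return the markers whose positivity meets the threshold, sorted by name."""
    return sorted(
        name
        for name, (positive, total) in case_counts.items()
        if positivity(positive, total) >= threshold
    )


case_counts = {
    "EGFR": (22, 46),       # 22 of 46 prostate adenocarcinomas (Cohen et al.)
    "FAS": (82, 87),        # 82 of 87 invasive carcinomas (Swinnen et al.)
    "marker_X": (10, 100),  # hypothetical marker, shown only to illustrate an exclusion
}

selected = select_probes(case_counts)
print(selected)
```

A marker at exactly 20 of 100 cases passes, which matches the "20% or greater" wording of the criterion.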

34βE12

[0512] This is a high-molecular-weight cytokeratin (HMWCK) that is present in basal cells of
the prostate. Adenocarcinoma of the prostate lacks immunoreactivity with this antibody. Yang
and associates (Yang, X.J., et al., Rare expression of high-molecular-weight cytokeratin in
adenocarcinoma of the prostate gland: a study of 100 cases of metastatic and locally advanced
prostate cancer. Am J Surg Pathol, 1999. 23(2): p. 147-52) studied 100 cases of metastatic and
locally advanced prostate cancer with 34βE12 by immunohistochemistry. Only 4 cases were
positive for 34βE12. They concluded that 34βE12 is a very useful marker to differentiate prostate
adenocarcinoma from normal prostatic glands. Another group studied 228 equivocal cases and
found 34βE12 of great value in confirming, establishing, or changing diagnoses in questionable
foci seen in the everyday practice of surgical pathology (Wojno, K.J. and J.I. Epstein, The utility
of basal cell-specific anti-cytokeratin antibody (34 beta E12) in the diagnosis of prostate cancer.
A review of 228 cases. Am J Surg Pathol, 1995. 19(3): p. 251-60).
B72.3 

[0513] This is a monoclonal antibody that recognizes tumor-associated glycoprotein 72.
It has been shown to react with various types of adenocarcinoma, including prostate
adenocarcinoma. Mazur and associates (Mazur, M.T. and J.J. Shultz, Prostatic adenocarcinoma.
Evaluation of immunoreactivity to monoclonal antibody B72.3. Am J Clin Pathol, 1990. 93(4): p.
466-70) showed that over 95% of benign prostate disease was negative for B72.3, whereas 77%
of prostate adenocarcinoma was at least focally positive and 100% of well-differentiated prostate
adenocarcinoma was B72.3 positive. Another study demonstrated that benign epithelium and
stromal tissue did not immunostain with B72.3. Immunostaining was detected within the
malignant cells in 30% of localized prostate adenocarcinomas (Myers, R.B., et al., Expression of
tumor-associated glycoprotein 72 in prostatic intraepithelial neoplasia and prostatic
adenocarcinoma. Mod Pathol, 1995. 8(3): p. 260-5).
C-erb-B2 

[0514] This is an oncogene also known as Her2/neu, and its product is an oncoprotein.
C-erb-B2 belongs to the epidermal growth factor receptor (EGFR) family that includes c-erb-B1,
c-erb-B2, and c-erb-B3. Over-expression of c-erb-B2 is well known in breast carcinoma. The role
of c-erb-B2 remains uncertain in the pathogenesis and progression of human prostate cancer.
Previous studies have reported widely divergent rates of c-erb-B2 expression in primary prostate
tumors, probably due to significant methodologic differences in the studies. Reese and associates
(Reese, D.M., et al., HER2 protein expression and gene amplification in androgen-independent
prostate cancer. Am J Clin Pathol, 2001. 116(2): p. 234-9) studied Her2 protein expression in
androgen-independent prostate cancer by immunohistochemistry and fluorescence in situ
hybridization (FISH). They discovered that 36% of the specimens were positive for Her2 but
only 6% had gene amplification, indicating that the mechanism for Her2 amplification and protein
expression differs from that in breast cancer. Another study demonstrated that c-erb-B2 is a very
useful biomarker for advanced prostate carcinoma (Arai, Y., T. Yoshiki, and O. Yoshida,
c-erbB-2 oncoprotein: a potential biomarker of advanced prostate cancer. Prostate, 1997. 30(3):
p. 195-201).

E-Cadherin 

[0515] This is a transmembrane Ca++-dependent cell adhesion molecule. Decreased
expression of E-Cadherin has been demonstrated in a number of carcinomas and has been
associated with both a higher tendency to metastasize and a decreased survival rate. Kuniyasu
and associates (Kuniyasu, H., et al., Relative expression of type IV collagenase, E-cadherin, and
vascular endothelial growth factor/vascular permeability factor in prostatectomy specimens
distinguishes organ-confined from pathologically advanced prostate cancers. Clin Cancer Res,
2000. 6(6): p. 2295-308) showed that a decreased expression of E-Cadherin was associated with
an increasing Gleason score. Other studies also revealed that E-Cadherin is down-regulated in
prostatic bone metastases and that reduced E-Cadherin is associated with poor prognosis in
patients with prostate cancer (Bryden, A.A., et al., E-cadherin and beta-catenin are
down-regulated in prostatic bone metastases. BJU Int, 2002. 89(4): p. 400-3; Umbas, R., et al.,
Decreased E-cadherin expression is associated with poor prognosis in patients with prostate
cancer. Cancer Res, 1994. 54(14): p. 3929-33).
EGFR 

[0516] Epidermal growth factor receptor (EGFR) is a transmembrane glycoprotein. It plays an
important role in cell growth and differentiation. Epidermal growth factor (EGF), one of its
ligands, interacts with cell-surface EGFR to induce receptor
tyrosine phosphorylation and activation of the intracellular signal-transduction pathways. EGF
appears to be the predominant EGF-related growth factor in the normal prostate and in benign
prostatic hyperplasia (BPH). EGFR is located in the basal/neuroendocrine (NE) compartment of
the benign prostate and exhibits relatively androgen-independent expression. EGF-related
peptides and EGFR are also present in neoplastic prostatic tissues. Androgen-independent cancer
cells exhibit more EGFR expression and phosphorylation than do androgen-responsive prostate
cancer cells (Sherwood, E.R. and C. Lee, Epidermal growth factor-related peptides and the
epidermal growth factor receptor in normal and malignant prostate. World J Urol, 1995. 13(5):
p. 290-6). One study found that 22 of 46 prostate adenocarcinomas expressed EGFR (Cohen, D.W.,
et al., Expression of transforming growth factor-alpha and the epidermal growth factor receptor
in human prostate tissues. J Urol, 1994. 152(6 Pt 1): p. 2120-4).

Fatty Acid Synthase (FAS)
[0517] Fatty acid synthase (FAS), or onco-antigen 519 (OA-519), is a key lipogenic enzyme. It
has been recently associated with poor prognosis in breast cancers. Shurbaji and associates
(Shurbaji, M.S., J.H. Kalbfleisch, and T.S. Thurmond, Immunohistochemical detection of a fatty
acid synthase (OA-519) as a predictor of progression of prostate cancer. Hum Pathol, 1996.
27(9): p. 917-21) demonstrated that FAS was expressed in 57% of prostate cancers and that
FAS-positive cancers were more likely to progress than FAS-negative cancers. In Swinnen's study
(Swinnen, J.V., et al., Overexpression of fatty acid synthase is an early and common event in the
development of prostate cancer. Int J Cancer, 2002. 98(1): p. 19-22), benign hyperplastic
glandular structures were all negative for FAS staining, whereas an immunohistochemical signal
was evident in 24 of 25 low grade prostatic intraepithelial neoplasia (PIN) lesions, in 26 of 26
high grade PIN lesions and in 82 of 87 invasive carcinomas. Staining intensity tended to increase
from low grade to high grade PIN to invasive carcinoma. Cancers with a high FAS expression had an
overall high proliferative index. Another study found that FAS is a significant prognostic marker
for prostate cancer and is one of the few markers that provides additional predictive information
beyond that of the Gleason score (Epstein, J.I., M. Carmichael, and A.W. Partin, OA-519 (fatty
acid synthase) as an independent predictor of pathologic state in adenocarcinoma of the
prostate. Urology, 1995. 45(1): p. ...).

ID-1

[0518] The helix-loop-helix protein ID-1 serves to prevent basic helix-loop-helix transcription
factors from binding to DNA, thus inhibiting the transcription of differentiation-associated
genes. Over-expression of ID-1 has been reported in certain tumors, such as breast, esophageal,
pancreatic and medullary thyroid cancers. Ouyang and associates (Ouyang, X.S., et al., Over
expression of ID-1 in prostate cancer. J Urol, 2002. 167(6): p. 2598-602) documented
negative to weak expression of ID-1 in normal prostate or BPH tissue by
immunohistochemical study and in situ hybridization. In contrast, all prostate cancer biopsies
showed significant positive ID-1 expression in tumor cells at the messenger RNA and protein
levels. Furthermore, expression of ID-1 was stronger in poorly differentiated than in
well-differentiated carcinomas, suggesting that the level of ID-1 expression may be associated
with tumor grade.

Kallikrein 2

[0519] Human glandular kallikrein 2 (hK2) is a member of a multigene family of serine
proteases. It is expressed in prostate epithelium and is under androgen regulation. Kallikrein 2 is
present in serum and seminal fluid. It can form complexes with endogenous protease inhibitors
(e.g., alpha2-macroglobulin and alpha1-antichymotrypsin). Studies show that the specificity of
prostate cancer detection increased when kallikrein 2 was combined with prostate-specific antigen
(PSA) (Partin, A.W., et al., Use of human glandular kallikrein 2 for the detection of prostate
cancer: preliminary analysis. Urology, 1999. 54(5): p. 839-45). Another study showed that the
combination of human glandular kallikrein and free PSA enhances
discrimination between prostate cancer and benign prostatic hyperplasia in patients with a
moderately increased total PSA (Magklara, A., et al., The combination of human glandular
kallikrein and free prostate-specific antigen (PSA) enhances discrimination between prostate
cancer and benign prostatic hyperplasia in patients with moderately increased total PSA. Clin
Chem, 1999. 45(11): p. 1960-6). Nam and associates (Nam, R.K., et al., Serum human glandular
kallikrein-2 protease levels predict the presence of prostate cancer among men with elevated
prostate-specific antigen. J Clin Oncol, 2000. 18(5): p. 1036-42) documented that serum human
glandular kallikrein 2 levels predict the presence of prostate cancer among men with elevated
PSA.

Ki-67 

[0520] This is a nuclear protein that is expressed in proliferating normal and neoplastic cells.
Ki-67 expression occurs during the phases of the cell cycle designated as late G1, S, M, and G2,
but not in G0 phase (Cattoretti, G., et al., Monoclonal antibodies against recombinant parts of
the Ki-67 antigen (MIB 1 and MIB 3) detect proliferating cells in microwave-processed
formalin-fixed paraffin sections. J Pathol, 1992. 168(4): p. 357-63). Ki-67 is commonly used as a
cell proliferation marker and it is located in the nucleus. Studies show that Ki-67 staining in
prostate cancer provides independent prognostic information after radical prostatectomy and that
Ki-67 immunoreactivity is a predictor for prostate cancer survival (Halvorsen, O.J., et al.,
Maximum Ki-67 staining in prostate cancer provides independent prognostic information after
radical prostatectomy. Anticancer Res, 2001. 21(6A): p. 4071-6; Stattin, P., et al., Cell
proliferation assessed by Ki-67 immunoreactivity on formalin fixed tissues is a predictive factor
for survival in prostate cancer. J Urol, 1997. 157(1): p. 219-22).

Leu 7
[0521] This antigen is present on non-cancerous and cancerous prostatic epithelia, as well as
natural killer cells and myelinated nerves. Liu and associates (Liu, X., et al.,
Immunohistochemical study of HNK-1 (Leu-7) antigen in prostate cancer and its clinical
significance. Chin Med J (Engl), 1995. 108(7): p. 516-21) showed that 94% of prostate cancer is
positive for Leu 7. Well-differentiated cancer showed the highest percentage of positive cancer
cells and the strongest staining, while poorly differentiated cancer had the lowest percentage of
positive cancer cells and the weakest staining. Another study suggested that the expression of
Leu 7 on prostate cancer may be a useful prognostic factor for patients with prostate cancer (Liu,
X.H., et al., The prognostic value of the HNK-1 (Leu-7) antigen in prostatic cancer--an
immunohistochemical study. Hinyokika Kiyo, 1993. 39(5): p. 439-44).

MDM2 

[0522] This is a cellular proto-oncogene product. MDM2 binds to p53, promoting degradation
via ubiquitin, masking its transactivation domain, and inhibiting its transcriptional activation of
genes related to cell cycle arrest and apoptosis (Momand, J. and G.P. Zambetti, Mdm-2: "big
brother" of p53. J Cell Biochem, 1997. 64(3): p. 343-52). Amplification of the MDM2 gene or
over-expression of the MDM2 protein has been implicated in the development of tumors, and
MDM2 over-expression has been related to more aggressive disease and poorer survival
(Freedman, D.A., L. Wu, and A.J. Levine, Functions of the MDM2 oncoprotein. Cell Mol Life
Sci, 1999. 55(1): p. 96-107). Leite and associates (Leite, K.R., et al., Abnormal expression of
MDM2 in prostate carcinoma. Mod Pathol, 2001. 14(5): p. 428-36) showed that MDM2 was
over-expressed in 41% of prostate adenocarcinoma cases. Tumors that were positive for both p53
and MDM2 were larger and of more advanced stage. These results suggest that the
MDM2-positive/p53-positive phenotype identifies prostate cancers with aggressive behavior.
N-Cadherin 

[0523] Similar to E-Cadherin, this is a cell adhesion molecule. Changes in cell-cell interactions
are critical in the process of cancer progression. Likewise, it has been shown that loss of
expression of the cell adhesion molecule E-cadherin is associated with grade, stage, and
prognosis in many carcinomas, including prostate cancer. Impaired E-cadherin-mediated
interactions result in an invasive phenotype. However, the mere loss of cell-cell contact and
communication is not the sole explanation for the observed correlation between loss of
E-cadherin-mediated adhesion and poor clinical outcome (Kuniyasu, H., et al., Relative expression
of type IV collagenase, E-cadherin, and vascular endothelial growth factor/vascular permeability
factor in prostatectomy specimens distinguishes organ-confined from pathologically advanced
prostate cancers. Clin Cancer Res, 2000. 6(6): p. 2295-308; Bryden, A.A., et al., E-cadherin
and beta-catenin are down-regulated in prostatic bone metastases. BJU Int, 2002. 89(4): p.
400-3; Umbas, R., et al., Decreased E-cadherin expression is associated with poor prognosis in
patients with prostate cancer. Cancer Res, 1994. 54(14): p. 3929-33). Bussemakers and
associates (Bussemakers, M.J., et al., Complex cadherin expression in human prostate cancer
cells. Int J Cancer, 2000. 85(3): p. 446-50) demonstrated that N-Cadherin is up-regulated in
human prostate cancer cell lines. Another study showed that N-cadherin was not expressed in
normal prostate tissue; in prostatic cancer, however, N-cadherin was found to be expressed in the
poorly differentiated areas, which showed mainly aberrant or negative E-cadherin staining
(Tomita, K., et al., Cadherin switching in human prostate cancer progression. Cancer Res, 2000.
60(13): p. 3650-4).

P504S 

[0524] Alpha-methylacyl-CoA racemase (P504S) is a cytoplasmic protein. It was recently
identified by cDNA library subtraction in conjunction with high throughput microarray screening
from prostate carcinoma. Jiang and associates (Jiang, Z., et al., P504S: a new molecular marker
for the detection of prostate carcinoma. Am J Surg Pathol, 2001. 25(11): p. 1397-404) examined
P504S by immunocytochemistry on benign and cancerous prostate tissues. P504S showed strong
cytoplasmic granular staining in 100% of the prostate carcinomas regardless of Gleason score,
and the staining was typically diffuse. In contrast, 171 of 194 (88%) benign prostates, including
56 of 67 (84%) benign prostate cases and 115 of 127 (91%) cases of benign glands adjacent to
cancers, were negative for P504S. Another study revealed that P504S is a very useful marker for
distinguishing atypical adenomatous hyperplasia of the prostate from prostatic adenocarcinoma
(Yang, X.J., et al., Expression of alpha-methylacyl-CoA racemase (P504S) in atypical adenomatous
hyperplasia of the prostate. Am J Surg Pathol, 2002. 26(7): p. 921-5).
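
The benign-tissue counts reported for P504S can be checked with a few lines of arithmetic. The sketch below treats a negative stain in benign tissue as a true negative, so the pooled proportion corresponds to specificity as defined earlier in this document (True Negatives/(False Positives + True Negatives)). The variable and function names are illustrative only.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# (P504S-negative cases, total cases) as reported in the text for Jiang et al.
benign_prostates = (56, 67)
benign_adjacent_glands = (115, 127)

# Pool the two benign groups.
pooled_negative = benign_prostates[0] + benign_adjacent_glands[0]
pooled_total = benign_prostates[1] + benign_adjacent_glands[1]

print(pooled_negative, pooled_total)           # pooled counts
print(percent(*benign_prostates))              # benign prostate cases
print(percent(*benign_adjacent_glands))        # benign glands adjacent to cancers
print(percent(pooled_negative, pooled_total))  # all benign tissue
```

The two subgroups sum to the pooled 171 of 194, and the rounded percentages agree with the 84%, 91% and 88% stated above.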



p53 

[0525] This is a tumor suppressor gene. Inactivation of p53 is implicated in tumorigenesis for
over half of all human cancers. It functions as a transcriptional regulator involved in G1 phase
growth arrest of cells in response to DNA damage, as well as having a role in the regulation of the
spindle checkpoint, centrosome homeostasis, and G2-M phase transition. It also induces
apoptosis by transcription-dependent and independent mechanisms in many cell types and
regulates tumor angiogenesis (Kirsch, D.G. and M.B. Kastan, Tumor-suppressor p53:
implications for tumor development and prognosis. J Clin Oncol, 1998. 16(9): p. 3158-68;
Agarwal, M.L., et al., The p53 network. J Biol Chem, 1998. 273(1): p. 1-4; Liebermann, D.A., B.
Hoffman, and R.A. Steinman, Molecular controls of growth arrest and apoptosis: p53-dependent
and independent pathways. Oncogene, 1995. 11(1): p. 199-210). Thompson and associates
(Thompson, S.J., et al., P53 and Ki-67 immunoreactivity in human prostate cancer and benign
hyperplasia. Br J Urol, 1992. 69(6): p. 609-13) showed p53 staining in prostate cancer
specimens, whereas no staining was observed in benign prostate hyperplasia (BPH). Another study
showed that nuclear accumulation of p53 was a significant prognostic indicator for prostate
cancer (Quinn, D.I., et al., Prognostic significance of p53 nuclear accumulation in localized
prostate cancer treated with radical prostatectomy. Cancer Res, 2000. 60(6): p. 1585-94).

Prostate Acid Phosphatase (PAP) 
[0526] Prostate acid phosphatase (PAP), like prostate specific antigen (PSA), has greatly 
increased the feasibility of reliable diagnosis of primary or metastatic prostatic carcinoma. 
Studies show that the diagnostic sensitivity of PAP is 90-100% and its specificity is 87-100%, and 
that PAP is a useful prognostic indicator of advanced prostatic carcinoma (Svanholm, H. and M. 
Horder, Clinical application of prostatic markers. I. Classification of prostatic tumours using 
immunohistochemical techniques. Scand J Urol Nephrol Suppl, 1988. 107: p. 65-70; Sakai, H., 
et al., Prostate specific antigen and prostatic acid phosphatase immunoreactivity as prognostic 
indicators of advanced prostatic carcinoma. J Urol, 1993. 149(5): p. 1020-3). 

Prostate Inhibin Peptide (PIP) 
[0527] Prostate inhibin peptide (PIP) is a polypeptide synthesized by the prostate gland. It is 
involved in prostatic growth and differentiation. The PIP gene is localized in the 7q34 region 
that contains a number of fragile sites. Rearrangement of PIP genes was found in prostate 
carcinomas (Autiero, M., et al., Abnormal restriction pattern of PIP gene associated with human 
primary prostate cancers. DNA Cell Biol, 1999. 18(6): p. 481-7). Garde and associates (Garde, 
S.V., et al., Prostate inhibin peptide (PIP) in prostate cancer: a comparative 
immunohistochemical study with prostate-specific antigen (PSA) and prostatic acid phosphatase 
(PAP). Cancer Lett, 1994. 78(1-3): p. 11-7) showed that PIP is as sensitive and specific an 
immunohistochemical marker as PSA and PAP in the diagnosis of prostate carcinoma. Further, 
the androgen-independent nature of PIP may give it an advantage over PSA/PAP in tumors 
exposed to androgen ablating agents. 

Prostate Specific Antigen (PSA) 
[0528] Prostate specific antigen (PSA) is a product of prostatic epithelium and is normally 
secreted in the semen. It is a serine protease whose function is to cleave and liquefy the seminal 
coagulum formed after ejaculation. PSA is organ specific, not cancer specific. Thus, elevations 
in PSA levels occur not only in cancer, but also in non-neoplastic conditions, such as nodular 
hyperplasia of the prostate and prostatitis. PSA by itself cannot be used for detection of early 
cancer. When combined with a rectal examination and transrectal ultrasonography, however, 
measurement of PSA levels is considered useful in detection of early-stage cancers (Crawford, 
J.M. and R.S. Cotran, The Male Genital Tract, in Robbins Pathologic Basis of Disease, R.S. 
Cotran, V. Kumar, and T. Collins, Editors. 1999, W. B. Saunders Company: Philadelphia, p. 
1011-1034). Further evidence suggests that free serum PSA is more accurate than total PSA in the 
diagnosis of prostate carcinoma (Catalona, W.J., et al., Use of the percentage of free prostate-
specific antigen to enhance differentiation of prostate cancer from benign prostatic disease: a 
prospective multicenter clinical trial. JAMA, 1998. 279(19): p. 1542-7). Immunohistochemical 
study of PSA for prostate carcinoma diagnosis is also very sensitive and specific. Svanholm and 
associates (Svanholm, H. and M. Horder, Clinical application of prostatic markers. I. 
Classification of prostatic tumours using immunohistochemical techniques. Scand J Urol Nephrol 
Suppl, 1988. 107: p. 65-70) demonstrated that the sensitivity of PSA for prostate carcinoma 
diagnosis is 94-100% and its specificity is 100%. 

V. Breast Cancer 
[0529] Non-invasive mammography as a screening tool for breast cancer is not effective 
(Gotzsche, P.C. and O. Olsen, Is screening for breast cancer with mammography justifiable? 
Lancet, 2000. 355(9198): p. 129-34). Therefore, other techniques for screening for breast cancer 
have been studied. A variety of breast cancer markers have been discovered to aid physicians in 
making timely, precise diagnoses, and to provide significantly better patient management. 
Unfortunately, none of these tumor markers is a "magic bullet" with both high sensitivity and 
specificity. Therefore, alternative ways to enhance diagnostic accuracy are necessary. 
[0530] An alternative way to enhance diagnostic accuracy is to develop a panel comprising a 
plurality of probes each of which specifically binds a marker associated with breast cancer. All 
candidate probes are to be tested with ICC and/or IHC techniques. In some embodiments, 
specimens may be obtained from fine needle aspirations (FNA). In other embodiments, 
specimens may be obtained from fine needle biopsies (FNB). Proper location and sampling of 
tumors by FNA and FNB procedures may be aided by ultrasound and other image-guiding 
techniques. In some embodiments, test cells are obtained from breast ductal lavage. Devices for 
this purpose are readily available and may work by collecting cells through an aspirator similar to 
a manual breast pump. A small suction cup is used to draw nipple aspirate fluid (NAF) through 
the nipple. The presence of NAF helps locate natural openings of the ducts on the surface of the 
nipple. Then a tiny, flexible microcatheter is inserted approximately half an inch into the duct to 
be lavaged in order to collect cells lining the breast duct. 
[0531] Once the specimens are collected, the specimens will be processed and analyzed. 
Statistical analysis will be used to design panels, as described above for lung cancer. During 
processing, technical issues such as cell smears or pellets not sticking to slides during harsh 
washings may occur in some embodiments. However, such issues can readily be addressed by 
manipulation of software or modifying staining protocols to mitigate such problems. In some 
embodiments, the specimens will be processed and analyzed using a device that automatically 
samples the specimen and prepares slides for diagnosis. It is anticipated that a broad menu of 
probes will be used initially. The number of probes will be pruned to a suitably sized panel in 
order to retain a high level of sensitivity and specificity. Selection of the final probes will be 
based on a pre-defined threshold of the percentage of positive stained tumor cells. Sophisticated 
statistical analysis will be employed to make these determinations. Since the panel-assay 
approach to detecting malignancies is applicable to solid tumors, and several of the same tumor 
markers are in different panels, this method may be carried out in parallel, as well as serially. In 
this manner, the assay development process can be expedited. 
Library of Probes/Markers 

[0532] Various sources containing information about cancer markers were reviewed. An 
arbitrary criterion of 20% or greater positivity of breast cancer was used to select probes for a 
preferred panel for detection and/or diagnosis of breast cancer. The term "20% or greater 
positivity" means that if 100 tumor cases were studied, 20 or more of these cases would have 
shown a presence of the individual marker, while the remaining cases (80 or fewer) would not 
have shown a presence of the individual marker. A preferred panel may include molecular markers 
selected from AE1/AE3, BCA-225, Bcl-2, BRCA-1, Cancer Antigen 15.3 (CA 15.3), Cathepsin D, 
Carcinoembryonic Antigen (CEA), C-erb-B2, E-Cadherin, Epidermal Growth Factor Receptor 
(EGFR), Estrogen receptor (ER), Gross Cystic Disease Fluid Protein 15 (GCDFP-15), HOX-B3, 
Ki-67, MUC-1, p53, p65, Progesterone Receptor (PR), Retinoblastoma (Rb) and 
Transglutaminase K (TGK). A brief description of the library of probes/markers utilized in the 
present example is provided below. 
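
The 20% positivity criterion can be illustrated with a short sketch. It is offered only as one possible reading of the criterion, not as the procedure actually used: the positivity fractions are the breast carcinoma figures quoted in the marker descriptions of this section, the entry named hypothetical_marker_X is invented solely to show a rejected candidate, and the function and variable names are assumptions.

```python
POSITIVITY_THRESHOLD = 0.20  # "20% or greater positivity"


def select_candidate_markers(positivity, threshold=POSITIVITY_THRESHOLD):
    """Keep markers whose reported positivity meets the threshold, highest first."""
    selected = [(name, frac) for name, frac in positivity.items() if frac >= threshold]
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Fraction of breast carcinoma cases reported positive for each marker.
reported_positivity = {
    "BCA-225": 0.94,
    "CEA": 0.90,
    "MUC-1": 0.90,
    "p65": 0.80,                    # biopsied tissue, by immunohistochemistry
    "TGK": 17 / 30,                 # 17 of 30 breast carcinomas
    "EGFR": 0.53,                   # strongly positive tumor tissues
    "C-erb-B2": 0.34,               # fine needle aspirations
    "hypothetical_marker_X": 0.12,  # invented entry, below the threshold
}

candidates = select_candidate_markers(reported_positivity)
for name, frac in candidates:
    print(f"{name}: {frac:.0%}")
```

A marker at exactly 20% is retained because the comparison is inclusive, matching the "20 or more of 100 cases" wording. Positivity alone says nothing about specificity, so such a filter would only produce the initial library from which a panel is later pruned.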
AE1/AE3 

[0533] This is a cocktail of anti-keratin antibodies. Keratins are a group of water-insoluble 
proteins that form an intermediate filament in the cells of epithelial origin, such as breast ductal 
epithelium. Anti-keratin AE1 recognizes the 56 and 40 kD keratins of the acidic sub-family. 
Anti-keratin AE3 recognizes the basic sub-family. AE1 and AE3 have been shown to be more 
effective than other anti-cytokeratins in identification of lymph node metastasis from breast 
ductal carcinoma (Elson, C.E., D. Kufe, and W.W. Johnston, Immunohistochemical detection and 
significance of axillary lymph node micrometastases in breast carcinoma. A study of 97 cases. 
Anal Quant Cytol Histol, 1993. 15(3): p. 171-8). Kowolik and associates (Kowolik, J.H., et al., 
Detection of micrometastases in sentinel lymph nodes of the breast applying monoclonal 
antibodies AE1/AE3 to pancytokeratins. Oncol Rep, 2000. 7(4): p. 745-9) showed that 32 of 33 
cases of axillary lymph nodes with breast cancer metastasis were correctly predicted by AE1/AE3 
immunohistochemical staining. AE1/AE3 has also been shown to be the most sensitive marker 
for detecting occult metastasis in nodes of infiltrating lobular breast carcinoma (Kainz, C., et al., 
Infiltrating lobular breast carcinoma: detection of occult regional lymph node metastasis by 
immunohistochemistry. Anticancer Res, 1993. 13(1): p. 73-4). 
BCA-225 

[0534] This antibody recognizes a human breast carcinoma associated glycoprotein, BCA-225 
(220-225 kD), which differs in size and distribution from other breast carcinoma antigens. 
However, unlike other carcinoma antibodies against breast carcinoma antigens, BCA-225 does 
not react with benign or malignant gastrointestinal tissues. One study showed 94% of breast 
carcinomas are positive for BCA-225 (Mesa-Tejada, R., et al., Immunocytochemical distribution 
of a breast carcinoma associated glycoprotein identified by monoclonal antibodies. Am J Pathol, 
1988. 130(2): p. 305-14), while another showed 78% of effusions containing breast carcinoma 
were positive and all benign effusions were negative for BCA-225 by immunocytochemistry. 
BCA-225 is a highly specific discriminator and very useful in differential diagnosis of 
adenocarcinoma and reactive mesothelial cells (Loy, T.S., A.A. Diaz-Arias, and J.T. Bickel, 
Value of BCA-225 in the cytologic diagnosis of malignant effusions: an immunocytochemical 
study of 197 cases. Mod Pathol, 1990. 3(3): p. 294-7). 
Bcl-2 

[0535] Survival threshold for a cell is determined by the balance between cell-death suppressor 
and cell-death promoter signals provided by external factors or stimuli, as well as by intracellular 
molecules. Bcl-2 has a central role in this determination and its product acts as an anti-apoptotic 
molecule (Daidone, M.G., et al., Clinical studies of Bcl-2 and treatment benefit in breast cancer 
patients. Endocr Relat Cancer, 1999. 6(1): p. 61-8). The Bcl-2 protein has been shown to 
contribute to oncogenesis because it can transform and immortalize cells in cooperation with 
c-myc, ras, or viral genes (Del Bufalo, D., et al., Bcl-2 overexpression enhances the metastatic 
potential of a human breast cancer line. Faseb J, 1997. 11(12): p. 947-53). Bcl-2 expression is 
most commonly associated with the t(14;18) translocation in most follicular lymphomas. More 
recently, Bcl-2 has been identified in non-hematologic malignancies. Alsabeh and associates 
determined that Bcl-2 is a useful marker in distinguishing metastatic breast carcinoma from 
primary lung and gastric carcinomas, and it is a useful prognostic indicator as well (Alsabeh, R., 
et al., Expression of bcl-2 by breast cancer: a possible diagnostic application. Mod Pathol, 1996. 
9(4): p. 439-44). 

BRCA-1 

[0536] This is a tumor suppressor gene located on the long arm of chromosome 17. Tumor 
suppressor genes play a critical role in regulating cell growth. BRCA-1 is a nuclear 
phosphoprotein, which normally functions as a negative regulator of the cell cycle and may be an 
active inhibitor of neoplastic progression. Mutation of the BRCA1 gene has been demonstrated 
in 80% of familial breast cancer. Decreased mRNA levels or aberrant sub-cellular locations of 
BRCA1 have been identified in breast cancer lines and in sporadic cases of breast cancer tissues. 
BRCA1 mutations are linked to ovarian cancer as well (Lee, W.Y., et al., Immunolocalization of 
BRCA1 protein in normal breast tissue and sporadic invasive ductal carcinomas: a correlation 
with other biological parameters. Histopathology, 1999. 34(2): p. 106-12). Jarvis and associates 
(Jarvis, E.M., J.A. Kirk, and C.L. Clarke, Loss of nuclear BRCA1 expression in breast cancers is 
associated with a highly proliferative tumor phenotype. Cancer Genet Cytogenet, 1998. 101(2): 
p. 109-15) found that nuclear staining for BRCA-1 was observed in most sporadic tumors, but 
nuclear BRCA-1 was reduced or absent in the majority of familial and early onset breast tumors. 
A significant inverse correlation was found between nuclear BRCA-1 and expression of the 
proliferation marker Ki-67. Another study revealed BRCA-1 expression was correlated with 
other prognostic markers including p53, c-erb-B2, Bcl-2, ER, histological grade, tumor size, 
axillary lymph node status and age (Lee, W.Y., et al., Immunolocalization of BRCA1 protein in 
normal breast tissue and sporadic invasive ductal carcinomas: a correlation with other 
biological parameters. Histopathology, 1999. 34(2): p. 106-12). 

Cancer Antigen 15.3 (CA-15.3) 
[0537] Cancer antigen 15.3 is a serum carbohydrate antigen. Increased serum concentration of 
CA-15.3 has been associated with breast carcinoma as well as normal and benign breast disease. 
However, the level of CA-15.3 is significantly higher in breast carcinoma than that in either 
normal or benign breast disease (Barak, M., et al., CA-15.3, TPA and MCA as markers for breast 
cancer. Eur J Cancer, 1990. 26(5): p. 577-80). Martoni and associates (Martoni, A., et al., CEA, 
MCA, CA 15.3 and CA 549 and their combinations in expressing and monitoring metastatic 
breast cancer: a prospective comparative study. Eur J Cancer, 1995. 31A(10): p. 1615-21) have 
documented that CA 15.3 has higher sensitivity than other tumor markers in detecting metastatic 
breast cancer. 

Cathepsin D 

[0538] This is a soluble lysosomal aspartic proteinase. It is synthesized in the endoplasmic 
reticulum as a preprocathepsin D. Having a mannose-6-phosphate tag, procathepsin D is 
recognized by a mannose-6-phosphate receptor. Upon entering into an acidic lysosome, 
single-chain procathepsin D (52 kD) is activated to cathepsin D. The fundamental role of 
cathepsin D is the degradation of intracellular and internalized proteins. Increased levels of 
cathepsin D (both at the mRNA and protein levels) were first reported in several human 
neoplastic tissues in the mid-eighties. These findings generated intense research in a possible 
role for cathepsin D in neoplastic processes. A strong predictive value was found for cathepsin D 
concentrations in breast cancer, as well as many other tumor types (Vetvicka, V., et al., Analysis 
of the interaction of procathepsin D activation peptide with breast cancer cells. Int J Cancer, 
1997. 73(3): p. 403-9; Vetvicka, V., J. Vetvickova, and M. Fusek, Effect of procathepsin D and 
its activation peptide on prostate cancer cells. Cancer Lett, 1998. 129(1): p. 55-9). Niu and 
associates (Niu, Y., et al., Potential markers predicting distant metastasis in axillary 
node-negative breast carcinoma. Int J Cancer, 2002. 98(5): p. 754-60) found Cathepsin D to be a 
potential marker predicting distant metastasis in axillary node-negative breast cancer patients. 

Carcinoembryonic Antigen (CEA) 
[0539] Carcinoembryonic antigen (CEA) is well known for its role in diagnosis and follow-up 
for colorectal cancer. CEA positivity of breast carcinoma is also reported. Alexiev and 
associates (Alexiev, B.A., et al., Immunocytochemical detection of carcinoembryonic antigen in 
fine-needle aspirates from patients with diverse breast diseases. Diagn Cytopathol, 1993. 9(4): 
p. 377-82) reported that 90% of primary breast carcinomas showed positive cytoplasmic staining 
for CEA, whereas none of the benign breast diseases were positive on fine-needle aspirates and 
paraffin-embedded tissue sections. Others have reported CEA positivity in fibroadenoma 
and cystic change disease ranging from 25% to 64% (Wittekind, C., S. Von Kleist, and W. 
Sandritter, CEA positivity in tissue and sera of patients with benign breast lesions. Oncodev Biol 
Med, 1981. 2(6): p. 381-90). 

C-erb-B2 

[0540] This is an oncogene and its product is known as an oncoprotein. C-erb-B2, also known as 
Her2/neu, belongs to the epidermal growth factor receptor (EGFR) family that includes c-erb-B1, 
c-erb-B2, and c-erb-B3. Over-expression of c-erb-B2 is associated with a high percentage of 
human carcinomas arising within the breast, ovary, lung, prostate, stomach and salivary glands. 
It is suggested that tumors that over-express growth factor receptors, such as c-erb-B2, would be 
exquisitely sensitive to the growth-promoting effects of a small amount of growth factors and 
hence likely to be more aggressive. This hypothesis is supported by the observation that high 
levels of c-erb-B2 protein on breast cancer cells are a harbinger of poor prognosis (Mitchell, R.N. 
and R.S. Cotran, Neoplasia, in Robbins Pathologic Basis of Disease, R.S. Cotran, V.K. Kumar, 
and T. Collins, Editors. 1999, W. B. Saunders Company: Philadelphia, p. 260-327). Inaji and 
associates (Inaji, H., et al., ErbB-2 protein levels in nipple discharge: role in diagnosis of early 
breast cancer. Tumour Biol, 1993. 14(5): p. 271-8) found that levels of c-erb-B2 oncoprotein in 
nipple discharge were elevated in breast carcinoma patients. One study revealed that 34% of fine 
needle aspirations of breast carcinoma and all but one corresponding tissue section were positive 
for c-erb-B2 oncoprotein (Jorda, M., P. Ganjei, and M. Nadji, Retrospective c-erbB-2 
immunostaining in aspiration cytology of breast cancer. Diagn Cytopathol, 1994. 11(3): p. 262-5). 

E-Cadherin 

[0541] This protein is suggested to be the major cell adhesion molecule in mammary glands. In 
cytoplasm, E-Cadherin is linked to alpha- and beta-catenin, which mediate the connection to the 
cytoskeleton. In addition, c-erbB-2 oncoprotein causes disruption of the cell adhesion system 
through beta-catenin phosphorylation (Nagae, Y., et al., Expression of E-cadherin catenin and 
C-erbB-2 gene products in invasive ductal-type breast carcinomas. J Nippon Med Sch, 2002. 69(2): 
p. 165-71). Decreased expression of E-Cadherin is found in breast carcinoma (Mitchell, R.N. 
and R.S. Cotran, Neoplasia, in Robbins Pathologic Basis of Disease, R.S. Cotran, V.K. Kumar, 
and T. Collins, Editors. 1999, W. B. Saunders Company: Philadelphia, p. 260-327). 

Epidermal Growth Factor Receptor (EGFR) 
[0542] Epidermal growth factor receptor (EGFR) is a transmembrane glycoprotein. It plays an 
important role in cell growth and differentiation. EGFR expression has been shown in a broad 
spectrum of normal tissues, whereas over-expression has been associated with a variety of 
neoplasms. Suo and associates (Suo, Z., et al., Type 1 protein tyrosine kinases in benign and 
malignant breast lesions. Histopathology, 1998. 33(6): p. 514-21) examined the expression 
pattern of the four EGFR family members in breast tumor tissues and found 53% of breast tumor 
tissues were strongly positive for EGFR; benign tumors also expressed EGFR protein, but 
all at a lower, moderate level. An association between EGFR expression and increasing 
malignancy grade was found in a group of infiltrating ductal carcinomas. 
Estrogen Receptor (ER) 
[0543] Estrogens control a variety of physiological and disease-linked processes, most notably 
reproduction, bone remodeling and breast cancer, and their effects are transduced through classic 
receptors referred to as estrogen receptor (ER). This monoclonal antibody reacts with the 
N-terminal domain (A/B region) of the 67 kD polypeptide chain of the estrogen receptor, and 
exhibits a nuclear staining pattern with little or no cytoplasmic reactivity. In tumor tissue, ER 
reacts strongly with epithelial cells of breast cancers and human neoplasms derived from other 
estrogen-dependent tissues. In general, cancers that have cells expressing ER in their nuclei will 
have better prognoses because such positive neoplastic cells are better differentiated and can 
respond to hormonal manipulation. Tamoxifen, a drug, is often utilized for this purpose (Barnes, 
D.M., et al., Immunohistochemical determination of oestrogen receptor: comparison of different 
methods of assessment of staining and correlation with clinical outcome of breast cancer 
patients. Br J Cancer, 1996. 74(9): p. 1445-51; Pichon, M.F., et al., Prognostic value of steroid 
receptors after long-term follow-up of 2237 operable breast cancers. Br J Cancer, 1996. 73(12): 
p. 1545-51). 

Gross Cystic Disease Fluid Protein 15 (GCDFP-15) 
[0544] Gross Cystic Disease Fluid Protein 15 (GCDFP-15) is a glycoprotein (15 kD) expressed 
by apocrine sweat glands, eccrine glands, minor salivary glands, bronchial glands and 
metaplastic epithelium of the breast. Breast carcinomas (primary and metastatic lesions) with 
apocrine features express the GCDFP-15 antigen. GCDFP-15 is positive in extra-mammary 
Paget's disease, while other tumors tested negative. Numerous histopathologic studies have 
shown GCDFP-15 to be a specific marker for breast cancer in surgical specimens. Fiel and 
associates (Fiel, M.I., et al., Value of GCDFP-15 (BRST-2) as a specific immunocytochemical 
marker for breast carcinoma in cytologic specimens. Acta Cytol, 1996. 40(4): p. 637-41) also 
demonstrated that GCDFP-15 is a specific immunocytochemical marker for breast carcinoma in 
cytologic specimens. 

HOX-B3 

[0545] The homeobox (HOX) genes encode proteins which contain a 61 amino acid 
DNA-binding homeodomain and are involved in the transcriptional regulation of other genes during 
normal onto- and histogenesis. Class I HOX genes are organized into four clusters on different 
chromosomes in humans, with a high conservation in the order of the genes within each of these 
clusters. Re-expression of HOX gene products has been reported in a wide variety of 
neoplastically transformed cells and it seems quite likely that the HOX genes represent yet 
another class of oncofetal antigens involved in normal development and carcinogenesis, as well 
as tumor progression. HOX-B3 is one of the HOX gene products (HOX-B3, -B4, and -C6). One 
study showed over 90% of breast carcinomas are positive for HOX-B3 (Bodey, B., et al., 
Immunocytochemical detection of the homeobox B3, B4, and C6 gene products in breast 
carcinomas. Anticancer Res, 2000. 20(5A): p. 3281-6). 
Ki-67 

[0546] This is a nuclear protein that is expressed in proliferating normal and neoplastic cells. 
Ki-67 expression occurs during the phases of the cell cycle designated as late G1, S, M, and G2. 
However, during the G0 phase, the antigen cannot be detected. Studies have shown that 
expression of Ki-67 was inversely associated with estrogen and progesterone receptors, 
suggesting that a high Ki-67 level seems to characterize a more aggressive phenotype and poor 
prognosis (Ceccarelli, C., et al., Quantitative p21(waf1)/p53 immunohistochemical analysis 
defines groups of primary invasive breast carcinomas with different prognostic indicators. Int J 
Cancer, 2001. 95(2): p. 128-34; Ceccarelli, C., et al., Retinoblastoma (RB1) gene product 
expression in breast carcinoma. Correlation with Ki-67 growth fraction and biopathological 
profile. J Clin Pathol, 1998. 51(11): p. 818-24). 
MUC-1 

[0547] Epithelial mucins are glycoproteins secreted by epithelial cells and their carcinomas. At 
least nine mucin genes have been identified, and their products, MUC1 through MUC9, are 
expressed in various epithelia. MUC1 is a mucin expressed in breast epithelial cells, whereas 
MUC2 and MUC3 are primarily intestinal mucins. MUC-1 antigen is a cell surface glycoprotein. 
This antigen is abundant in 90% of human breast cancers in forms not present in normal tissue 
(Diaz, L.K., E.L. Wiley, and M. Morrow, Expression of epithelial mucins Muc1, Muc2, and 
Muc3 in ductal carcinoma in situ of the breast. Breast J, 2001. 7(1): p. 40-5). One study (Croce, 
M.V., et al., Expression of tumour associated antigens in normal benign and malignant human 
mammary epithelial tissue: a comparative immunohistochemical study. Anticancer Res, 1997. 
17(6D): p. 4287-92) showed benign breast tissues expressed a low intensity of MUC-1, restricted 
to apical cell surface membranes and lumen debris. 
p53 

[0548] This is a tumor suppressor gene. High concentrations of the p53 protein occur in a large 
number of tumors and tumor cell lines, while it is present in only minute amounts in normal cells 
and tissues. The increased expression in tumor cells may be caused by mutation of the p53 
protein or by complexing with other proteins. The gene for p53 is located on chromosome 17p, a 
frequent site of allele loss in tumors of the breast, lung, colon, ovaries, testicles, bladder, brain, 
melanomas, certain types of leukemia and neurofibrosarcoma. Tumor cell expression of p53 
protein can be detected by immunohistochemistry, exhibiting a nuclear staining pattern. 
Expression of the p53 oncoprotein has been shown to correlate with poor prognosis in breast 
cancer (Lee, W.Y., et al., Immunolocalization of BRCA1 protein in normal breast tissue and 
sporadic invasive ductal carcinomas: a correlation with other biological parameters. 
Histopathology, 1999. 34(2): p. 106-12; Midulla, C., et al., Immunohistochemical expression of 
p53, nm23-H1, Ki67 and DNA ploidy: correlation with lymph node status and other clinical 
pathologic parameters in breast cancer. Anticancer Res, 1999. 19(5B): p. 4033-7). 
p65 

[0549] This 65 kD oncofetal protein has been identified as a new member of the steroid/thyroid 
super-family of genes, with as yet an unknown ligand. It is suggested that this receptor may play 
an important role in the development of tumors. The altered form of p65 is linked to the 
overproduction of certain hormones that may cause breast cancers (Hanausek, M., et al., The 
oncofetal protein p65: a new member of the steroid/thyroid receptor superfamily. Cancer Detect 
Prev, 1996. 20(2): p. 94-102). Mirowski and associates (Mirowski, M., et al., Serological and 
immunohistochemical detection of a 65-kDa oncofetal protein in breast cancer. Eur J Cancer, 
1994. 30A(8): p. 1108-13) revealed that p65 was positive in 90% of sera from breast cancer 
patients and positive in 80% of corresponding biopsied tissue assessed by 
immunohistochemistry. Results indicate p65 may be a potential serum and/or 
immunohistochemical marker for breast cancer. 

Progesterone Receptor (PR) 
[0550] The monoclonal antibody to progesterone receptor (PR) exhibits a nuclear staining 
pattern. In tumor tissue, PR is strongly expressed in epithelial cells of breast cancers and human 
neoplasms derived from other progesterone-dependent tissues, while in normal tissues it is 
positive in mammary glands and the uterus. The significance of PR positivity in a breast 
carcinoma is less well understood. In general, cancers that are ER positive will also be PR 
positive. However, carcinomas that are PR positive, but not ER positive, may have a worse 
prognosis (Blanco, G., et al., Estrogen and progesterone receptors in breast cancer: 
relationships to tumour histopathology and survival of patients. Anticancer Res, 1984. 4(6): p. 
383-9). One study showed PR was positive in all cutaneous metastatic breast tumors whereas 
only one tumor was positive for ER (Wallace, M.L. and B.R. Smoller, Differential sensitivity of 
estrogen/progesterone receptors and BRST-2 markers in metastatic ductal and lobular breast 
carcinoma to the skin. Am J Dermatopathol, 1996. 18(3): p. 241-7). 
Retinoblastoma (Rb) 

[0551] This is one of the tumor suppressor genes. Retinoblastoma protein (pRb) is a protein 
that is encoded by the retinoblastoma gene and functions to regulate the cell cycle at G0/G1. 
Loss of Rb function leads to uncontrolled cell growth. Inactivation of the retinoblastoma gene is 
documented in various types of cancer, including breast cancer. Retinoblastoma gene 
underexpression promotes breast-tumor aggressiveness and rapid tumor-cell proliferation 
(Bieche, I. and R. Lidereau, Loss of heterozygosity at 13q14 correlates with RB1 gene 
underexpression in human breast cancer. Mol Carcinog, 2000. 29(3): p. 151-8). Ceccarelli and 
associates (Ceccarelli, C., et al., Retinoblastoma (RB1) gene product expression in breast 
carcinoma. Correlation with Ki-67 growth fraction and biopathological profile. J Clin Pathol, 
1998. 51(11): p. 818-24) studied pRb expression and tumor markers Ki-67, ER/PR, p53 and EGFR 
in invasive breast carcinoma and found that pRb expression paralleled proliferative activity in a 
majority of breast carcinomas examined, suggesting that in these cases the protein behaves 
normally in regulating the cell cycle. Conversely, in cases with a loss of pRb immunostaining, the 
combined expression of specific highly aggressive factors, such as EGFR and p53 expression, 
estrogen receptor/progesterone receptor negative status and high Ki-67, seems to characterize a 
more aggressive phenotype. 

Transglutaminase K (TGK) 
[0552] Transglutaminase K is not well studied for breast cancer diagnosis. However, Friedrich 
and associates (Friedrich, M., et al., Correlation between immunoreactivity for transglutaminase 
K and for markers of proliferation and differentiation in normal breast tissue and breast 
carcinomas. Eur J Gynaecol Oncol, 1998. 19(5): p. 444-8) showed weak to strong membrane 
staining was detected in 17 of 30 breast carcinomas, while 90% of normal breast tissue revealed 
no immunoreactivity to TGK. The results suggest that up-regulation of TGK in breast 
carcinomas may play an important role in the regulation of tumor cell invasive properties by 
modulating cell-matrix interactions, or by facilitating the assembly of matrix and tissue 
remodeling. 

VI. Cervical Cancer 

[0553] A variety of cervical cancer markers have been discovered to aid physicians in making
timely, precise diagnoses, and to provide significantly better patient management. Unfortunately,
none of these tumor markers is a "magic bullet" with both high sensitivity and specificity.
Therefore, alternative ways to enhance diagnostic accuracy are necessary.
[0554] An alternative way to enhance diagnostic accuracy is to develop a panel comprising a 
plurality of probes each of which specifically binds a marker associated with cervical cancer. All 
candidate probes are to be tested with ICC and/or IHC techniques. In some embodiments,
specimens may be obtained using Pap smears. Conventional Pap smears utilize spatulas and
brushes to collect cervical cells. For the liquid-based preparation (LBP) Pap tests, cells may be
collected using a broom, brush, or balloon. For example, Cytyc's ThinPrep® product and
TriPath's SurePrep™ product may be used. Additionally, Molecular Diagnostics' "e2
Collector™" may be used for obtaining cervical cells. It is a silicone balloon, shaped like a
mirror image of the cervix. When it is inflated against the cervix, cells adhere to the balloon's
surface, so that endocervix and ectocervix cells are collected in a single step.

[0555] Once the specimens are collected, the specimens will be processed and analyzed. 
Statistical analysis will be used to design panels, as described above for lung cancer. During
processing, technical issues such as cell smears or pellets not sticking to slides during harsh 
washings may occur in some embodiments. However, such issues can readily be addressed by 
manipulation of software or modifying staining protocols to mitigate such problems. In some 
embodiments, the specimens will be processed and analyzed using a device that automatically 
samples the specimen and prepares slides for diagnosis. It is anticipated that a broad menu of 
probes will be used initially. The number of probes will be pruned to a suitably sized panel in 
order to retain a high level of sensitivity and specificity. Selection of the final probes will be 
based on a pre-defined threshold of the percentage of positive stained tumor cells. Sophisticated 
statistical analysis will be employed to make these determinations. Since the panel-assay 
approach to detecting malignancies is applicable to solid tumors, and several of the same tumor
markers are in different panels, this method may be carried out in parallel, as well as serially. In
this manner, the assay development process can be expedited. 
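
By way of illustration only, the pruning of a broad menu of probes to a smaller panel might be
sketched as follows. The sketch is not the statistical procedure of the present disclosure. It
assumes binary staining results per specimen, an "any probe positive" calling rule, and a greedy
search on the sum of sensitivity and specificity; the probe names "A", "B" and "C" and all function
names are hypothetical.

```python
def sensitivity_specificity(specimens, panel):
    """Return (sensitivity, specificity) of a panel on labeled specimens.

    specimens: list of (stains, diseased) pairs, where stains maps a probe
    name to 1 (positive staining) or 0, and diseased is True or False.
    Assumption: a specimen is called positive if ANY probe in the panel stains.
    """
    tp = fn = tn = fp = 0
    for stains, diseased in specimens:
        called = any(stains[p] for p in panel)
        if diseased and called:
            tp += 1
        elif diseased:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def prune_panel(specimens, candidates, max_size):
    """Greedily add the probe that most improves sensitivity + specificity."""
    panel = []
    current = sum(sensitivity_specificity(specimens, panel))
    while len(panel) < max_size:
        scored = [
            (sum(sensitivity_specificity(specimens, panel + [p])), p)
            for p in candidates
            if p not in panel
        ]
        if not scored:
            break
        score, probe = max(scored)
        if score <= current:
            break  # no remaining probe improves the panel
        panel.append(probe)
        current = score
    return panel


# Hypothetical staining data: three diseased and two non-diseased specimens.
specimens = [
    ({"A": 1, "B": 0, "C": 0}, True),
    ({"A": 0, "B": 1, "C": 0}, True),
    ({"A": 1, "B": 0, "C": 1}, True),
    ({"A": 0, "B": 0, "C": 1}, False),
    ({"A": 0, "B": 0, "C": 0}, False),
]
panel = prune_panel(specimens, ["A", "B", "C"], max_size=3)
```

In this toy data set, probe "C" is left out because adding it lowers specificity without raising
sensitivity. Other calling rules or selection criteria could be substituted.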

[0556] Epidemiologic studies suggest that carcinoma of the cervix is caused by a sexually
transmitted agent, and human papillomavirus (HPV) is a prime suspect. Approximately 70
genetically distinct types of HPV have been identified. Types 16 and 18 and, less commonly, types
31, 33, 35, and 51 are found in approximately 85% of invasive squamous cell cancers and their
precursors (severe dysplasias and carcinoma in situ). In contrast to cervical cancers, genital
warts with low malignant potential are associated with the "low risk" types HPV-6 and HPV-11
(Mitchell, R.N. and R.S. Cotran, Neoplasia, in Robbins Pathologic Basis of Disease, R.S. Cotran,
V. Kumar, and T. Collins, Editors. 1999, W.B. Saunders Company: Philadelphia. p. 260-327).
[0557] Molecular diagnosis of HPV infection is more sensitive and specific than conventional
Pap smears, and has added value in the evaluation of women with equivocal Pap smear results.
Digene's Hybrid Capture II HPV DNA test is highly effective in detecting patients with high-
grade dysplasia (Solomon, D., M. Schiffman, and R. Tarone, Comparison of three management
strategies for patients with atypical squamous cells of undetermined significance: baseline
results from a randomized trial. J Natl Cancer Inst, 2001. 93(4): p. 293-9). Another HPV method
is detection of E6 and E7 proteins of HPV 16 and HPV 18 developed by Molecular Diagnostics.
The oncogenic potential of HPV-16 and HPV-18 has been related to these two early viral gene
products (Huibregtse, J.M. and S.L. Beaudenon, Mechanism of HPV E6 proteins in cellular
transformation. Semin Cancer Biol, 1996. 7(6): p. 317-26) and (zur Hausen, H., Papillomavirus
and p53. Nature, 1998. 393(6682): p. 217). Roche Diagnostics recently acquired HPV testing
patents from Institut Pasteur.

Library of Probes/Markers 
[0558] Various sources containing information about cancer markers were reviewed. An
arbitrary criterion of 20% or greater positivity of cervical cancer was used to select probes for a
preferred panel for detection and/or diagnosis of cervical cancer. The term "20% or greater
positivity" means that if 100 tumor cases were studied, 20 or more of these cases would have
shown a presence of the individual marker, while the remaining 80 or fewer cases would not
have shown a presence of the individual marker. A preferred panel may include molecular
markers selected from Carcinoembryonic Antigen (CEA), C-erb-B2, Cyclin E, E6/E7, Epidermal
Growth Factor Receptor (EGFR), Ki-67, p16, p53, Proliferating Cell Nuclear Antigen (PCNA),
Survivin, Telomerase and Vascular Endothelial Growth Factor. A brief description of the library
of probes/markers utilized in the present example is provided below. 
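
By way of illustration only, the 20% positivity criterion described above can be expressed as a
simple filter over per-marker case counts. The marker names and counts in this sketch are
hypothetical placeholders and are not data from the cited studies; integer arithmetic is used so
that a marker at exactly 20% is retained.

```python
def select_by_positivity(positive_cases, total_cases, threshold_pct=20):
    """Return the markers whose percentage of positive tumor cases is at
    least threshold_pct (20% by default, per the arbitrary criterion above).

    positive_cases and total_cases map a marker name to case counts.
    """
    selected = []
    for marker, n_pos in positive_cases.items():
        n_total = total_cases[marker]
        # n_pos / n_total >= threshold_pct / 100, written without division
        if n_total > 0 and n_pos * 100 >= threshold_pct * n_total:
            selected.append(marker)
    return sorted(selected)


# Hypothetical counts: positive tumor cases out of tumor cases studied.
positive = {"marker_A": 29, "marker_B": 5, "marker_C": 20}
total = {"marker_A": 40, "marker_B": 40, "marker_C": 100}
preferred = select_by_positivity(positive, total)
```

Here "marker_B" (5 of 40 cases, 12.5%) falls below the criterion and is excluded, while
"marker_C" (20 of 100 cases) meets it exactly and is retained.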

Carcinoembryonic Antigen (CEA) 
[0559] This carcinoembryonic antigen is a highly glycosylated cell surface protein that is
over-expressed in a variety of human tumors and has been used as a tumor marker for disease
progression in colorectal cancer patients. Previous reports have found elevated serum CEA
levels in patients with cervical cancer, although this did not correlate with disease progression
(Borras, G., et al., Tumor antigens CA 19.9, CA 125, and CEA in carcinoma of the uterine cervix.
Gynecol Oncol, 1995. 57(2): p. 205-11). Sarandakou and associates (Sarandakou, A., et al.,
Tumour-associated antigens CEA, CA125, SCC and TPS in gynaecological cancer. Eur J
Gynaecol Oncol, 1998. 19(1): p. 73-7) demonstrated that serum CEA levels in cervical cancer
patients were significantly increased compared to patients with benign gynecological diseases.
Another study reported that CEA expression increases in cervical intraepithelial neoplasia (CIN III)
and carcinoma in situ (CIS) by immunohistochemical staining even though serum CEA levels of
these patients were not elevated. The results suggest that CEA immunostaining may be more
sensitive than serum CEA to diagnose cervical dysplasia or cancer at an early stage (Tendler, A.,
H.L. Kaufman, and A.S. Kadish, Increased carcinoembryonic antigen expression in cervical
intraepithelial neoplasia grade 3 and in cervical squamous cell carcinoma. Hum Pathol, 2000.
31(11): p. 1357-62).

C-erb-B2 

[0560] This oncoprotein is a 185 kD membrane-bound glycoprotein. It is a receptor on the
cytoplasmic membrane that is homologous to the epidermal growth factor receptor (c-erb-B1).
The c-erb-B2 oncogene was independently discovered by several groups and consequently is
referred to by various names, including HER2 and neu (Coussens, L., et al., Tyrosine kinase
receptor with extensive homology to EGF receptor shares chromosomal location with neu
oncogene. Science, 1985. 230(4730): p. 1132-9; Bargmann, C.I., M.C. Hung, and R.A.
Weinberg, The neu oncogene encodes an epidermal growth factor receptor-related protein.
Nature, 1986. 319(6050): p. 226-30). Over-expression of c-erb-B2 has been demonstrated in
14% to 38% of patients with cervical cancer and has been found to be associated with poor
prognosis (Hale, R.J., et al., Prognostic value of c-erbB-2 expression in uterine cervical
carcinoma. J Clin Pathol, 1992. 45(7): p. 594-6). Other studies (Sharma, A., et al., Frequent
amplification of C-erbB2 (HER-2/Neu) oncogene in cervical carcinoma as detected by non-
fluorescence in situ hybridization technique on paraffin sections. Oncology, 1999. 56(1): p. 83-7;
Mitra, A.B., et al., ERBB2 (HER2/neu) oncogene is frequently amplified in squamous cell
carcinoma of the uterine cervix. Cancer Res, 1994. 54(3): p. 637-9) found frequent amplification
of c-erb-B2 in cervical cancer. C-erb-B2 expression is elevated in cervical carcinoma measured
by enzyme-linked immunosorbent assay (ELISA) and immunohistochemistry (IHC) (Kim, J.W.,
Y.T. Kim, and D.K. Kim, Correlation between EGFR and c-erbB-2 oncoprotein status and
response to neoadjuvant chemotherapy in cervical carcinoma. Yonsei Med J, 1999. 40(3): p.
207-14; Ngan, H.Y., et al., Abnormal expression of epidermal growth factor receptor and c-
erbB2 in squamous cell carcinoma of the cervix: correlation with human papillomavirus and
prognosis. Tumour Biol, 2001. 22(3): p. 176-83).

Cyclin E

[0561] Proteins known as cyclins and an associated group of regulatory proteins called cyclin-
dependent kinases (CDKs) regulate the key checkpoints of the cell cycle. Cyclin E is a 50 kD
protein that complexes with CDK2 in late G1 phase of the cell cycle. Carcinogenesis is
characterized by deregulation of the cell cycle. Although p53 is still the most important cell
cycle regulator in human malignancies, there is an increasing body of evidence indicating that
the aberrant expression of cyclins and cyclin-dependent kinase (CDK) inhibitors is one of the
most important events in malignant transformation of various human cancers. Cho and
associates (Cho, N.H., Y.T. Kim, and J.W. Kim, Correlation between G1 cyclins and HPV in the
uterine cervix. Int J Gynecol Pathol, 1997. 16(4): p. 339-47) found that cyclin E expression was
absent in normal cervical epithelium but was significantly higher in HPV-positive cases.
Another study revealed that patients with either invasive cervical cancer or cervical dysplasia
have a significantly higher cyclin E index (CEI) than do the control patients (Tae Kim, Y., et al.,
Expression of cyclin E and p27(KIP1) in cervical carcinoma. Cancer Lett, 2000. 153(1-2): p. 41-
50).

E6/E7 

[0562] Human papillomavirus (HPV) infection is associated with cervical cancer. E1 and E2
papillomavirus proteins are expressed at the early stage of infection and regulate DNA
replication. The E2 protein activates and represses transcription from different HPV promoters.
At some stage when viral DNA gets integrated into the cellular genome, the E2 gene is disrupted
or inactivated. This event leads to a derepression of the E6 and E7 viral oncogenes. E6 and E7
influence cell proliferation, gene expression, and progression to malignancy (Rosales, R., M.
Lopez-Contreras, and R.R. Cortes, Antibodies against human papillomavirus (HPV) type 16 and
18 E2, E6 and E7 proteins in sera: correlation with presence of papillomavirus DNA. J Med
Virol, 2001. 65(4): p. 736-44). Studies have shown that HPV 16 or 18 E6/E7 mRNA was not
detected in benign cervical disease. However, 60% of cervical adenocarcinoma in situ (ACIS)
and 24% of cervical adenocarcinoma expressed the HPV 16 oncogene. HPV 18 oncogene
expression was detected in 27% of ACIS and in 51% of invasive cervical cancer (Riethdorf, S., et
al., Analysis of HPV 16 and 18 E6/E7 oncogene expression in cervical and endometrial glandular
neoplasias. Cancer Detection and Prevention, 2000. 24(Supplement 1)). Rosales and associates
(Rosales, R., M. Lopez-Contreras, and R.R. Cortes, Antibodies against human papillomavirus
(HPV) type 16 and 18 E2, E6 and E7 proteins in sera: correlation with presence of
papillomavirus DNA. J Med Virol, 2001. 65(4): p. 736-44) studied 172 women with HPV
infection and found that antibodies against the E6 and E7 proteins of HPV 16 were found in 52%
and 37% of the patients, respectively. Antibodies against the E6 and E7 proteins of HPV 18
were found in 35% and 45% of the patients, respectively. Another study showed that HPV 16
and HPV 18 E6/E7 proteins were detected in 48% of cervicovaginal washings and 29% of sera
from patients with cervical cancer using enzyme-linked immunosorbent assay (ELISA).

Epidermal Growth Factor Receptor (EGFR) 
[0563] Epidermal growth factor receptor (EGFR) is a transmembrane glycoprotein. Binding
with its ligands initiates a chain of events that results in DNA synthesis, cell proliferation, and cell
differentiation. Activation of the EGFR has been shown to contribute to the growth and spread
of many different types of solid tumors. Up-regulation and over-expression of EGFR has been
correlated with many processes related to cancer, including uncontrolled cellular proliferation
and prevention of apoptosis (Wu, X., et al., Apoptosis induced by an anti-epidermal growth
factor receptor monoclonal antibody in a human colorectal carcinoma cell line and its delay by
insulin. J Clin Invest, 1995. 95(4): p. 1897-905). Many epithelial tumors express high EGFR,
which is associated with advanced disease and poor clinical prognosis, including cervical and
gastric cancers, as well as cancers of the colorectum, head and neck (Salomon, D.S., et al.,
Epidermal growth factor-related peptides and their receptors in human malignancies. Crit Rev
Oncol Hematol, 1995. 19(3): p. 183-232). Kim and associates (Kim, J.W., et al., Expression of
epidermal growth factor receptor in carcinoma of the cervix. Gynecol Oncol, 1996. 60(2): p.
283-7) showed that overexpression of EGFR was found in 29 of 40 (72%) invasive cervical
cancers and in 5 of 20 (25%) cervical intraepithelial neoplasia (CIN) patients. Over-expression
of EGFR appears to be an unfavorable prognostic factor, regardless of the presence of HPV16/18
(Kedzia, W., et al., [Immunohistochemical examination of oncogenic c-erb-b2, egf-r proteins and
antioncogenic p53 protein in vulvar cancers HPV-16 positive and negative]. Ginekol Pol, 2000.
71(2): p. 63-9).

Ki-67 

[0564] This is a nuclear protein expressed in proliferating normal and neoplastic cells. Ki-67
expression occurs during the phases of the cell cycle designated as late G1, S, M and G2.
However, during the G0 phase, the antigen cannot be detected (Cattoretti, G., et al., Monoclonal
antibodies against recombinant parts of the Ki-67 antigen (MIB 1 and MIB 3) detect
proliferating cells in microwave-processed formalin-fixed paraffin sections. J Pathol, 1992.
168(4): p. 357-63). Bar and associates (Bar, J.K., et al., Relations between the expression of p53,
c-erbB-2, Ki-67 and HPV infection in cervical carcinomas and cervical dysplasias. Anticancer
Res, 2001. 21(2A): p. 1001-6) documented that HPV infection, especially when accompanied by
an increase of proliferative activity in dysplasias, may define the cell subpopulation predisposed to
malignant process development. This is supported by results indicating that Ki-67 activity is found
in a higher percentage of patients who are HPV-positive than HPV-negative with carcinomas and
dysplasias.

p16

[0565] This gene is a cyclin-dependent kinase inhibitor (CDKI) and it may negatively regulate
the cell cycle by acting as a tumor suppressor. Cervical dysplasia is induced by persistent
infections through high-risk types of human papillomaviruses (HPVs). Outgrowth of dysplastic
lesions is triggered by increasing expression of two viral oncogenes, E6 and E7, which both
interact with various cell cycle regulating proteins. Among these is the retinoblastoma gene
product pRB, which is inactivated by E7. The pRB product inhibits transcription of the cyclin-
dependent kinase inhibitor gene p16(INK4a). Increasing expression of viral oncogenes in
dysplastic cervical cells might be reflected by increased expression of p16(INK4a). Sano and
associates (Sano, T., et al., Immunohistochemical overexpression of p16 protein associated with
intact retinoblastoma protein expression in cervical cancer and cervical intraepithelial
neoplasia. Pathol Int, 1998. 48(8): p. 580-5) demonstrated that strong immunoreactivity for the
p16 protein was observed in both nuclei and cytoplasm of all CIN and invasive cancer cases
except several low-grade CIN lesions. Studies also showed that overexpression of p16(INK4a) is
a specific marker for dysplastic and neoplastic epithelial cells of the cervix and that p16, along with
Ki-67 and cyclin E, are complementary surrogate biomarkers for HPV-related cervical neoplasia
(Klaes, R., et al., Overexpression of p16(INK4A) as a specific marker for dysplastic and
neoplastic epithelial cells of the cervix uteri. Int J Cancer, 2001. 92(2): p. 276-84; Keating, J.T.,
T. Ince, and C.P. Crum, Surrogate biomarkers of HPV infection in cervical neoplasia screening
and diagnosis. Adv Anat Pathol, 2001. 8(2): p. 83-92).

p53

[0566] This is one of the well-known tumor-suppressor genes. The p53 protein is present in
minute amounts in normal cells and tissues, but high concentrations occur in a large number of
tumors and tumor cell lines. Hence, it can be detected by immunohistochemistry (nuclear
staining). The increased concentration in tumor cells may be caused by complexing with other
proteins, or by mutation of the p53 protein. The gene for p53 is located on chromosome 17p, a
frequent site of allele loss in many tumors. Vassallo and associates (Vassallo, J., et al., High risk
HPV and p53 protein expression in cervical intraepithelial neoplasia. Int J Gynaecol Obstet,
2000. 71(1): p. 45-8) documented that p53 protein overexpression in CIN is associated with high
risk HPV infection. By using Western blot analysis and immunohistochemistry, rearrangement
of the p53 gene with overexpressed p53 protein was found in primary cervical cancer (Sahu,
G.R., et al., Rearrangement of p53 gene with overexpressed p53 protein in primary cervical
cancer. Oncol Rep, 2002. 9(2): p. 433-7).

Proliferating Cell Nuclear Antigen (PCNA) 
[0567] This proliferating cell nuclear antigen (PCNA) is a cofactor for DNA polymerase delta.
PCNA is expressed both in S phase of the cell cycle and during periods of DNA synthesis
associated with DNA repair. PCNA is expressed in proliferating cells in a wide range of normal
and malignant tissues. The location of PCNA is nuclear. Kobayashi and others demonstrated
that there was an intimate correlation between the PCNA and mitotic indexes in severe dysplasia
and carcinoma in situ (CIS) (Kobayashi, I., et al., The proliferative activity in dysplasia and
carcinoma in situ of the uterine cervix analyzed by proliferating cell nuclear antigen
immunostaining and silver-binding argyrophilic nucleolar organizer region staining. Hum
Pathol, 1994. 25(2): p. 198-202; Smela, M., M. Chosia, and W. Domagala, Proliferation cell
nuclear antigen (PCNA) expression in cervical intraepithelial neoplasia (CIN). An
immunohistochemical study. Pol J Pathol, 1996. 47(4): p. 171-4). Another study showed that PCNA
appears to be a better marker of immunoreactivity for CIN than Ki-67 (Maeda, M.Y., et al.,
Relevance of the rates of PCNA, Ki-67 and p53 expression according to the epithelial
compartment in cervical lesions. Pathologica, 2001. 93(3): p. 189-95).

Survivin

[0568] This is a 142 amino acid protein and is expressed in the G2/M phase of the cell cycle. It
is an inhibitor of apoptosis that is selectively overexpressed in common human cancers, but not
in normal tissues, and correlates with aggressive disease and unfavorable outcomes (Lehner, R.,
et al., Immunohistochemical localization of the IAP protein survivin in bladder mucosa and
transitional cell carcinoma. Appl Immunohistochem Mol Morphol, 2002. 10(2): p. 134-8).
Survivin mRNA is detected in cervical cancer tissue (Saitoh, Y., Y. Yaginuma, and M. Ishikawa,
Analysis of Bcl-2, Bax and Survivin genes in uterine cancer. Int J Oncol, 1999. 15(1): p. 137-41).
Immunohistochemical localization of survivin in benign cervical mucosa, cervical dysplasia, and
invasive squamous cell carcinoma showed that nuclear staining was detected in normal mucosa,
low-grade dysplasia, and high-grade dysplasia. Staining intensity was greatest in cases with
morphologic evidence of HPV infection (Frost, M., et al., Immunohistochemical localization of
survivin in benign cervical mucosa, cervical dysplasia, and invasive squamous cell carcinoma.
Am J Clin Pathol, 2002. 117(5): p. 738-44).

Telomerase

[0569] This is a ribonucleoprotein enzyme that extends and maintains telomeres of eukaryotic
chromosomes. Those cells that do not express telomerase have successively shortened telomeres
with each cell division, which ultimately leads to chromosomal instability, aging and cell death.
It has been hypothesized that infection with high-risk human papillomaviruses (HPVs), in
conjunction with other cellular events, plays a critical role in the development of cervical cancer.
Activation of the telomerase enzyme complex that synthesizes telomere repeats has been
associated with acquisition of immortal phenotype in vitro and is commonly observed in human
cancers (Anderson, S., et al., Telomerase activation in cervical cancer. Am J Pathol, 1997.
151(1): p. 25-31). Studies have shown that telomerase is exclusively present in cervical
carcinomas and a subset of cervical intraepithelial neoplasia grade III lesions, but not in normal
cervical tissues. Activation of telomerase appears to be associated with high-risk HPV infection,
accumulation of inactive p53 proteins and increased cell proliferation in cervical lesions
(Snijders, P.J., et al., Telomerase activity exclusively in cervical carcinomas and a subset of
cervical intraepithelial neoplasia grade III lesions: strong association with elevated messenger
RNA levels of its catalytic subunit and high-risk human papillomavirus DNA. Cancer Res, 1998.
58(17): p. 3812-8; Nair, P., et al., Telomerase, p53 and human papillomavirus infection in the
uterine cervix. Acta Oncol, 2000. 39(1): p. 65-70). Another study shows that telomerase activity
was detected in 96% of cervical tumor samples and in 69% of pre-malignant cervical scrapings,
but not detected in control hysterectomy samples and in cervical scrapings of normal healthy
patients. This indicates telomerase is a very sensitive and specific molecular marker for cervical
cancer screening (Reddy, V.G., et al., Telomerase-A molecular marker for cervical cancer
screening. Int J Gynecol Cancer, 2001. 11(2): p. 100-6).

Vascular Endothelial Growth Factor (VEGF) 
[0570] Vascular endothelial growth factor (VEGF) is an important angiogenesis factor and an
endothelial cell-specific mitogen. Angiogenesis is a critical process in the latter stages of
carcinogenesis and tumor progression, and is particularly important in the development of distant
metastasis. VEGF is known to be one of the most important inducers of angiogenesis and is
upregulated in carcinoma of the cervix. López-Ocejo and associates (López-Ocejo, O., et al.,
Oncogenes and tumor angiogenesis: the HPV-16 E6 oncoprotein activates the vascular
endothelial growth factor (VEGF) gene promoter in a p53 independent manner. Oncogene, 2000.
19(40): p. 4611-20) demonstrated that HPV-16 E6-positive cells generally express high levels of
the VEGF message and suggest a possibility that the HPV oncoprotein, E6, may contribute to
tumor angiogenesis by direct stimulation of the VEGF gene. Another study demonstrated that
expression of VEGF is involved in the promotion of angiogenesis in cervical cancer and plays an
important role in early cancer invasion (Kodama, J., et al., Vascular endothelial growth factor is
implicated in early invasion in cervical cancer. Eur J Cancer, 1999. 35(3): p. 485-9).

VII. Summary of Examples I-VI
[0571] Examples I-VI described above provide preferred probes/markers to be included in
panels for detecting and/or diagnosing lung, colorectal, bladder, prostate, breast, and cervical
cancer. Figures 8a-c provide a summary of the preferred probes/markers for each cancer type.
Figures 8a-c also identify which markers are useful for generic cancer detection as well as
those that are more valuable due to their specificity for a particular cancer type. For example,
EGFR and Ki-67 are useful for generic cancer detection, whereas BL2-10D1, CD44v3,
Collagenase, COX-1, HLA-DR, HSP-90, IL-6, IL-10, Lewis X, NMP-22, TGF-βI, TGF-βII,
TGF-βIII and UBC are useful for detection and/or diagnosis of bladder cancer; AE1/AE3, BCA-
225, BRCA-1, CA15.3, Cathepsin D, GCDFP-15, HOX-B3, p65, PR and TGK are useful for
detection and/or diagnosis of breast cancer; Cyclin E, E6 and E7 are useful for detection and/or
diagnosis of cervical cancer; AKT, amphiregulin, β-catenin, Bax, BPG, Cdk2/cdc2, cFLIP,
Cripto-1, Ephrin-B2, Ephrin-B4, Fas-L, HMGI(Y), hMLH1, Lysozyme, Matrilysin, p68, S100A4
and YB-1 are useful for detection and/or diagnosis of colorectal cancer; C-MET, Cyclin A, FGF-
2, Glut-1, Glut-3, HERA, MAGE-1, MAGE-3, Mucin 1, Nm23, p120, SP-A, SP-B,
Thrombomodulin and TTF-1 are useful for detection and/or diagnosis of lung cancer; and
34βE12, B72.3, FAS, ID-1, Kallikrein 2, Leu 7, P504S, PAP, PIP and PSA are useful for
detection and/or diagnosis of prostate cancer.






WHAT IS CLAIMED IS: 

1. A panel for detecting a generic disease state or discriminating between
specific disease states using cell-based diagnosis, comprising a plurality of probes each of which
specifically binds to a marker associated with a generic or specific disease state, wherein the
pattern of binding of the component probes of the panel to cells in a cytology specimen is
diagnostic of the presence or specific nature of said disease state.

2. The panel of claim 1, wherein said generic disease state is selected from
the group consisting of cancer and infectious diseases.

3. The panel of claim 2, wherein said cancer is selected from the group
consisting of epithelial cell-based cancers, solid tumor-based cancers, secretory tumor based
cancers, and blood based cancers. 

4. The panel of claim 2 wherein said infectious disease is selected from the 
group consisting of cell-based diseases in which the infectious organism is a virus, bacterium, 
protozoan, parasite, or fungus. 

5. The panel of claim 1, wherein said panel is optimized by using weighting
factors selected from the group consisting of cost, prevalence of a generic disease state in a
geographic location, prevalence of a specific disease state in a geographic location, availability of
probes and commercial considerations.

6. The panel of claim 1, wherein each of said probes comprises a detectable
label.

7. The panel of claim 6, wherein said probes comprise antibodies.

8. The panel of claim 6, wherein said label is selected from the group
consisting of a chromophore, a fluorophore, a dye, a radioisotope and an enzyme.

9. The panel of claim 8, wherein said label is a chromophore detected using
electromagnetic radiation selected from the group consisting of beta rays, gamma rays, X rays,
ultraviolet radiation, visible light, infrared radiation and microwaves.

10. The panel of claim 1, wherein said pattern of binding is detected using
photonic microscopy.

11. The panel of claim 10, wherein said photonic microscopy utilizes at least
one electromagnetic radiation selected from the group consisting of gamma rays, X rays, beta
rays, ultraviolet radiation, visible light, infrared radiation and microwaves.

12. The panel of claim 1, wherein said detecting is for sexually transmitted
diseases and said discriminating is between chlamydia, trichomonas, gonorrhea, herpes and
syphilis.

13. A method of forming a panel for detecting a disease state or discriminating 
between disease states in a patient using cell-based diagnosis, comprising: 

(a) determining the sensitivity and specificity of binding of probes each of 
which specifically binds to a member of a library of markers associated with a disease state; and 

(b) selecting a limited plurality of said probes whose pattern of binding is 
diagnostic for the presence or specific nature of said disease state. 

14. The method of claim 13, wherein said determining comprises:

(a) separately contacting a histological or cytological sample from a patient
known to be suffering from said disease and a histological or cytological sample from a patient
known not to be suffering from said disease with each of said probes;

(b) measuring the amount of specific binding of each probe with its
complementary disease marker at loci where said marker is known to be present in cells of said
samples; and

(c) correlating each said amount with the presence or specific nature of said
disease.

15. The method of claim 13, wherein said selecting comprises one or more of
statistical analytical methods, pattern recognition methods and neural network analysis.

16. The method of claim 13, where said selecting comprises the use of
weighting factors.

17. A method of detecting a disease or discriminating between disease states
comprising:

(a) contacting a cytological sample suspected of containing abnormal cells
characteristic of a disease state with a panel according to claim 1; and

(b) detecting a pattern of binding of said probes that is diagnostic for the
presence or specific nature of said disease state.

18. The method of claim 17, wherein said cytological sample is a cellular
sample collected from a body fluid, an epithelial cell-based organ system, a fine needle aspiration
or a biopsy.

19. The method of claim 18, wherein said cytological sample is sputum.

20. A panel for detecting a generic disease state or discriminating between 
specific disease states using cell-based diagnosis, wherein said panel is formed according to the 
method of claim 13. 

21. The panel of claim 1, wherein said disease marker is selected from the
group consisting of a morphologic biomarker, a genetic biomarker, a cell cycle biomarker, a
molecular biomarker and a biochemical biomarker.

22. The panel of claim 3, wherein said epithelial cell-based cancer is from the 
pulmonary, urinary, gastrointestinal or genital tract. 

23. The panel of claim 3, wherein said solid tumor-based cancer is selected 
from the group consisting of a sarcoma, breast cancer, pancreatic cancer, liver cancer, kidney 
cancer, thyroid cancer, and prostate cancer. 

24. The panel of claim 3, wherein said secretory tumor-based cancer is
selected from the group consisting of a sarcoma, breast cancer, pancreatic cancer, liver cancer, 
kidney cancer, thyroid cancer, and prostate cancer. 

25. The panel of claim 3, wherein said blood-based cancer is selected from the 
group consisting of leukemia and lymphoma. 

26. The method of claim 18, wherein said body fluid is selected from the
group consisting of blood, urine, spinal fluid and lymph. 

27. The method of claim 18, wherein said epithelial cell based organ system is 
selected from the group consisting of the pulmonary tract, the urinary tract, the genital tract and 
the gastrointestinal tract. 

28. The method of claim 18, wherein said fine needle aspiration is from solid
tissue types in organs and systems.

29. The method of claim 18, wherein said biopsy is from solid tissue types in
organs and systems.

30. The method of claim 28, wherein said organs and systems are selected
from the group consisting of breast, pancreas, liver, kidney, thyroid, bone marrow, muscle,
prostate and lung.

31. The panel of claim 21, wherein said morphologic biomarker is selected
from the group consisting of DNA ploidy, MACs and premalignant lesions.

32. The panel of claim 21, wherein said genetic biomarker is selected from the
group consisting of DNA adducts, DNA mutations and apoptotic indices. 

33. The panel of claim 21, wherein said cell cycle biomarker is selected from 
the group consisting of cellular proliferation markers, differentiation markers, regulatory
molecules and apoptosis markers. 

34. The panel of claim 21, wherein said molecular biomarker or biochemical
biomarker is selected from the group consisting of oncogenes, tumor suppressor genes, tumor
antigens, growth factors and receptors, enzymes, proteins, prostaglandins and adhesion 
molecules. 

35. The method of claim 29, wherein said organs and systems are selected 
from the group consisting of breast, pancreas, liver, kidney, thyroid, bone marrow, muscle, 
prostate and lung. 
Figure 8a
Figure 8b
Figure 8c

Marker           Bladder   Breast    Cervical  Colorectal  Lung      Prostate
                 Cancer    Cancer    Cancer    Cancer      Cancer    Cancer
Glut-3                                                     X
HERA                                                       X
MAGE-1                                                     X
MAGE-3                                                     X
Nm23                                                       X
P120                                                       X
SP-A                                                       X
SP-B                                                       X
Thrombomodulin                                             X
TTF-1                                                      X
34βE12                                                               X
B72.3                                                                X
FAS                                                                  X
ID-1                                                                 X
Kallikrein 2                                                         X
Leu7                                                                 X
P504S                                                                X
PAP                                                                  X
PfP                                                                  X
PSA                                                                  X




References:

[1] Goldberg-Kahn, B., Healy, J.C. and Bishop, J.W. (1997) The cost of diagnosis: a
comparison of four different strategies in the workup of solitary radiographic lung lesions. Chest
111, 870-6.

[2] O'Donovan, P.B. (1997) The radiologic appearance of lung cancer. Oncology (Huntingt)
11, 1387-402; discussion 1402-4.

[3] Worrell, J.A. (1995) Radiology of the central airways. Otolaryngol Clin North Am 28,
701-20.

[4] Henschke, C.I., Miettinen, O.S., Yankelevitz, D.F., Libby, D.M. and Smith, J.P. (1994)
Radiographic screening for cancer. Proposed paradigm for requisite research. Clin Imaging 18,
16-20.

[5] Lam, S., Kennedy, T., Unger, M., Miller, Y.E., Gelmont, D., Rusch, V., Gipe, B.,
Howard, D., LeRiche, J.C., Goldman, A. and Gazdar, A.F. (1998) Localization of bronchial
intraepithelial neoplastic lesions by fluorescence bronchoscopy. Chest 113, 696-702.

[6] Sazon, D.A., Santiago, S.M., Soo Hoo, G.W., Khonsary, A., Brown, C., Mandelkern, M.,
Blahd, W. and Williams, A.J. (1996) Fluorodeoxyglucose-positron emission tomography in the
detection and staging of lung cancer. Am J Respir Crit Care Med 153, 417-21.

[7] Lowe, V.J., DeLong, D.M., Hoffman, J.M. and Coleman, R.E. (1995) Optimum scanning
protocol for FDG-PET evaluation of pulmonary malignancy. J Nucl Med 36, 883-7.

[8] Lowe, V.J., Fletcher, J.W., Gobar, L., Lawson, M., Kirchner, P., Valk, P., Karis, J.,
Hubner, K., Delbeke, D., Heiberg, E.V., Patz, E.F. and Coleman, R.E. (1998) Prospective
investigation of positron emission tomography in lung nodules. J Clin Oncol 16, 1075-84.

[9] Raab, S.S., Hornberger, J. and Raffin, T. (1997) The importance of sputum cytology in
the diagnosis of lung cancer: a cost-effectiveness analysis. Chest 112, 937-45.

[10] Franklin, W.A. (1998) New molecular and cellular approaches to lung cancer detection.
In: Biology of Lung Cancer, pp. 529-570.

[11] Kern, W.H. (1988) The diagnostic accuracy of sputum and urine cytology. Acta Cytol 32,
651-4.

[12] Mehta, A.C., Marty, J.J. and Lee, F.Y. (1993) Sputum cytology. Clin Chest Med 14,
69-85.

[13] Gledhill, A., Bates, C., Henderson, D., DaCosta, P. and Thomas, G. (1997) Sputum
cytology: a limited role. J Clin Pathol 50, 566-8.

[14] Steffee, C.H., Segletes, L.A. and Geisinger, K.R. (1997) Changing cytologic and
histologic utilization patterns in the diagnosis of 515 primary lung malignancies. Cancer 81,
105-15.

[15] Zaman, M.B. (1991) Pulmonary cytology. Clin Lab Med 11, 293-315.

[16] Flehinger, B.J. and Melamed, M.R. (1994) Current status of screening for lung cancer.
Chest Surg Clin N Am 4, 1-15.

[17] Koss, L.G., Melamed, M.R. and Goodner, J.T. (1964) Pulmonary cytology: A brief
survey of diagnostic results from July 1st, 1952 until December 31st, 1960. Acta Cytol 8, 104.

[18] Saccomanno, G., Saunders, R.P., Ellis, H., Archer, V.E., Wood, B.G. and Beckler, P.A.
(1963) Concentration of carcinoma or atypical cells in sputum. Acta Cytol 5, 305-10.

[19] Miura, H., Konaka, C., Kawate, N., Tsuchida, T. and Kato, H. (1992) Sputum cytology-positive,
bronchoscopically negative adenocarcinoma of the lung [see comments]. Chest 102,
1328-32.

[20] Valatis, J., Warrens, D. and Gamble, D. (1981) Increased incidence of adenocarcinoma of
the lung. Cancer 47, 1042-1046.

[21] Baldini, E.H. and Strauss, G.M. (1997) Women and lung cancer: waiting to exhale. Chest
112, 229S-234S.

[22] Caldwell, C.J. and Berry, C.L. (1996) Is the incidence of primary adenocarcinoma of the
lung increasing? Virchows Arch 429, 359-63.

[23] Risse, E.K., Vooijs, G.P. and van't Hof, M.A. (1987) Relationship between the cellular
composition of sputum and the cytologic diagnosis of lung cancer. Acta Cytol 31, 170-6.

[24] Holiday, D.B., McLarty, J.W., Farley, M.L., Mabry, L.C., Cozens, D., Roby, T.,
Waldron, E., Underwood, R.D., Anderson, E., Culbreth, W. and et al. (1995) Sputum cytology
within and across laboratories. A reliability study. Acta Cytol 39, 195-206.

[25] Eddy, D.M. (1989) Screening for lung cancer [see comments]. Ann Intern Med 111,
232-7.

[26] Younes, M., Brown, R.W., Stephenson, M., Gondo, M. and Cagle, P.T. (1997)
Overexpression of Glut1 and Glut3 in stage I nonsmall cell lung carcinoma is associated with
poor survival. Cancer 80, 1046-51.

[27] Ogawa, J., Inoue, H. and Koide, S. (1997) Glucose-transporter-type-I-gene amplification
correlates with sialyl-Lewis-X synthesis and proliferation in lung cancer. Int J Cancer 74,
189-92.

[28] Ito, T., Noguchi, Y., Satoh, S., Hayashi, H., Inayama, Y. and Kitamura, H. (1998)
Expression of facilitative glucose transporter isoforms in lung carcinomas: its relation to
histologic type, differentiation grade, and tumor stage [see comments]. Mod Pathol 11, 437-43.

[29] Sosolik, R.C., McGaughy, V.R. and De Young, B.R. (1997) Anti-MOC-31: a potential
addition to the pulmonary adenocarcinoma versus mesothelioma immunohistochemistry panel.
Mod Pathol 10, 716-9.

[30] Ordonez, N.G. (1998) Value of the MOC-31 monoclonal antibody in differentiating
epithelial pleural mesothelioma from lung adenocarcinoma. Hum Pathol 29, 166-9.

[31] Takanami, I., Tanaka, F., Hashizume, T., Kikuchi, K., Yamamoto, Y., Yamamoto, T. and
Kodaira, S. (1996) The basic fibroblast growth factor and its receptor in pulmonary
adenocarcinomas: an investigation of their expression as prognostic markers. Eur J Cancer 32A,
1504-9.

[32] Takanami, I., Imamura, T., Hashizume, T., Kikuchi, K., Yamamoto, Y., Yamamoto, T.
and Kodaira, S. (1996) Immunohistochemical detection of basic fibroblast growth factor as a
prognostic indicator in pulmonary adenocarcinoma. Jpn J Clin Oncol 26, 293-7.

[33] Ohta, Y., Endo, Y., Tanaka, M., Shimizu, J., Oda, M., Hayashi, Y., Watanabe, Y. and
Sasaki, T. (1996) Significance of vascular endothelial growth factor messenger RNA expression
in primary lung cancer. Clin Cancer Res 2, 1411-6.

[34] Volm, M., Koomagi, R., Mattern, J. and Stammler, G. (1997) Prognostic value of basic
fibroblast growth factor and its receptor (FGFR-1) in patients with non-small cell lung
carcinomas. Eur J Cancer 33, 691-3.

[35] Hiyama, K., Hiyama, E., Ishioka, S., Yamakido, M., Inai, K., Gazdar, A.F., Piatyszek,
M.A. and Shay, J.W. (1995) Telomerase activity in small-cell and non-small-cell lung cancers
[see comments]. J Natl Cancer Inst 87, 895-902.

[36] Yashima, K., Litzky, L.A., Kaiser, L., Rogers, T., Lam, S., Wistuba, I.I., Milchgrub, S.,
Srivastava, S., Piatyszek, M.A., Shay, J.W. and Gazdar, A.F. (1997) Telomerase expression in
respiratory epithelium during the multistage pathogenesis of lung carcinomas. Cancer Res 57,
2373-7.

[37] Ahrendt, S.A., Yang, S.C., Wu, L., Westra, W.H., Jen, J., Califano, J.A. and Sidransky,
D. (1997) Comparison of oncogene mutation detection and telomerase activity for the molecular
staging of non-small cell lung cancer. Clin Cancer Res 3, 1207-14.

[38] Albanell, J., Lonardo, F., Rusch, V., Engelhardt, M., Langenfeld, J., Han, W., Klimstra,
D., Venkatraman, E., Moore, M.A. and Dmitrovsky, E. (1997) High telomerase activity in
primary lung cancers: association with increased cell proliferation rates and advanced pathologic
stage. J Natl Cancer Inst 89, 1609-15.

[39] Hiyama, K., Ishioka, S., Shay, J.W., Taooka, Y., Maeda, A., Isobe, T., Hiyama, E.,
Maeda, H. and Yamakido, M. (1998) Telomerase activity as a novel marker of lung cancer and
immune-associated lung diseases. Int J Mol Med 1, 545-9.

[40] Yahata, N., Ohyashiki, K., Ohyashiki, J.H., Iwama, H., Hayashi, S., Ando, K., Hirano, T.,
Tsuchida, T., Kato, H., Shay, J.W. and Toyama, K. (1998) Telomerase activity in lung cancer
cells obtained from bronchial washings. J Natl Cancer Inst 90, 684-90.

[41] Lee, J.C., Jong, H.S., Yoo, C.G., Han, S.K., Shim, Y.S. and Kim, Y.W. (1998)
Telomerase activity in lung cancer cell lines and tissues. Lung Cancer 21, 99-103.

[42] Arai, T., Yasuda, Y., Takaya, T., Ito, Y., Hayakawa, K., Toshima, S., Shibuya, C.,
Yoshimi, N. and Kashiki, Y. (1998) Application of telomerase activity for screening of primary
lung cancer in broncho-alveolar lavage fluid. Oncol Rep 5, 405-8.

[43] Fujii, M., Motoi, M., Saeki, H., Aoe, K. and Moriwaki, S. (1993) Prognostic significance
of proliferating cell nuclear antigen (PCNA) expression in non-small cell lung cancer. Acta Med
Okayama 47, 103-8.

[44] Kawai, T., Suzuki, M., Kono, S., Shinomiya, N., Rokutanda, M., Takagi, K., Ogata, T.
and Tamai, S. (1994) Proliferating cell nuclear antigen and Ki-67 in lung carcinoma. Correlation
with DNA flow cytometric analysis. Cancer 74, 2468-75.

[45] Ogawa, J., Tsurumi, T., Yamada, S., Koide, S. and Shohtsu, A. (1994) Blood vessel
invasion and expression of sialyl Lewisx and proliferating cell nuclear antigen in stage I
non-small cell lung cancer. Relation to postoperative recurrence. Cancer 73, 1177-83.

[46] Ebina, M., Steinberg, S.M., Mulshine, J.L. and Linnoila, R.I. (1994) Relationship of p53
overexpression and up-regulation of proliferating cell nuclear antigen with the clinical course of
non-small cell lung cancer. Cancer Res 54, 2496-503.

[47] Fontanini, G., Vignati, S., Bigini, D., Merlo, G.R., Ribecchini, A., Angeletti, C.A.,
Basolo, F., Pingitore, R. and Bevilacqua, G. (1994) Human non-small cell lung cancer: p53
protein accumulation is an early event and persists during metastatic progression [see comments].
J Pathol 174, 23-31.

[48] Wiethege, T., Voss, B. and Muller, K.M. (1995) P53 accumulation and proliferating-cell
nuclear antigen expression in human lung cancer. J Cancer Res Clin Oncol 121, 371-7.

[49] Esposito, V., Baldi, A., De Luca, A., Micheli, P., Mazzarella, G., Baldi, F., Caputi, M.
and Giordano, A. (1997) Prognostic value of p53 in non-small cell lung cancer: relationship with
proliferating cell nuclear antigen and cigarette smoking. Hum Pathol 28, 233-7.

[50] Caputi, M., Esposito, V., Groger, A.M., Pacilio, C., Murabito, M., Dekan, G., Baldi, F.,
Wolner, E. and Giordano, A. (1998) Prognostic role of proliferating cell nuclear antigen in lung
cancer: an immunohistochemical analysis. In Vivo 12, 85-8.

[51] Hirata, T., Fukuse, T., Naiki, H., Hitomi, S. and Wada, H. (1998) Expression of CD44
variant exon 6 in stage I non-small cell lung carcinoma as a prognostic factor. Cancer Res 58,
1108-10.

[52] Ariza, A., Mate, J.L., Isamat, M., Lopez, D., Von Uexkull-Guldeband, C., Rosell, R.,
Fernandez-Vasalo, A. and Navas-Palacios, J.J. (1995) Standard and variant CD44 isoforms are
commonly expressed in lung cancer of the non-small cell type but not of the small cell type. J
Pathol 177, 363-8.

[53] Fasano, M., Sabatini, M.T., Wieczorek, R., Sidhu, G., Goswami, S. and Jagirdar, J.
(1997) CD44 and its v6 spliced variant in lung tumors: a role in histogenesis? Cancer 80, 34-41.

[54] Miyoshi, T., Kondo, K., Hino, N., Uyama, T. and Monden, Y. (1997) The expression of
the CD44 variant exon 6 is associated with lymph node metastasis in non-small cell lung cancer.
Clin Cancer Res 3, 1289-97.

[55] Tran, T.A., Kallakury, B.V., Sheehan, C.E. and Ross, J.S. (1997) Expression of CD44
standard form and variant isoforms in non-small cell lung carcinomas. Hum Pathol 28, 809-14.

[56] Takigawa, N., Segawa, Y., Mandai, K., Takata, I. and Fujimoto, N. (1997) Serum CD44
levels in patients with non-small cell lung cancer and their relationship with clinicopathological
features. Lung Cancer 18, 147-57.

[57] Kondo, K., Miyoshi, T., Hino, N., Shimizu, E., Masuda, N., Takada, M., Uyama, T. and
Monden, Y. (1998) High frequency expressions of CD44 standard and variant forms in
non-small cell lung cancers, but not in small cell lung cancers. J Surg Oncol 69, 128-36.

[58] Sasaki, J.I., Tanabe, K.K., Takahashi, K., Okamoto, I., Fujimoto, H., Matsumoto, M.,
Suga, M., Ando, M. and Saya, H. (1998) Expression of CD44 splicing isoforms in lung cancers:
dominant expression of CD44v8-10 in non-small cell lung carcinomas. Int J Oncol 12, 525-33.

[59] Volm, M., Koomagi, R., Mattern, J. and Stammler, G. (1997) Cyclin A is associated with
an unfavourable outcome in patients with non-small-cell lung carcinomas. Br J Cancer 75,
1774-8.

[60] Dobashi, Y., Shoji, M., Jiang, S.X., Kobayashi, M., Kawakubo, Y. and Kameya, T.
(1998) Active cyclin A-CDK2 complex, a possible critical factor for cell proliferation in human
primary lung carcinomas. Am J Pathol 153, 963-72.

[61] Volm, M., Rittgen, W. and Drings, P. (1998) Prognostic value of ERBB-1, VEGF, cyclin
A, FOS, JUN and MYC in patients with squamous cell lung carcinomas [published erratum
appears in Br J Cancer 1998 Apr;77(7):1198]. Br J Cancer 77, 663-9.

[62] Shoji, M., Dobashi, Y., Morinaga, S., Jiang, S.X. and Kameya, T. (1999) Tumor
extension and cell proliferation in adenocarcinomas of the lung. Am J Pathol 154, 909-18.

[63] Shapiro, G.I., Edwards, C.D., Kobzik, L., Godleski, J., Richards, W., Sugarbaker, D.J.
and Rollins, B.J. (1995) Reciprocal Rb inactivation and p16INK4 expression in primary lung
cancers and cell lines. Cancer Res 55, 505-9.

[64] Betticher, D.C., Heighway, J., Hasleton, P.S., Altermatt, H.J., Ryder, W.D., Cerny, T. and
Thatcher, N. (1996) Prognostic significance of CCND1 (cyclin D1) overexpression in primary
resected non-small-cell lung cancer. Br J Cancer 73, 294-300.

[65] Mate, J.L., Ariza, A., Aracil, C., Lopez, D., Isamat, M., Perez-Piteira, J. and
Navas-Palacios, J.J. (1996) Cyclin D1 overexpression in non-small cell lung carcinoma:
correlation with Ki67 labelling index and poor cytoplasmic differentiation. J Pathol 180, 395-9.

[66] Yang, W.I., Chung, K.Y., Shin, D.H. and Kim, Y.B. (1996) Cyclin D1 protein expression
in lung cancer. Yonsei Med J 37, 142-50.

[67] Betticher, D.C., Heighway, J., Thatcher, N. and Hasleton, P.S. (1997) Abnormal
expression of CCND1 and RB1 in resection margin epithelia of lung cancer patients. Br J Cancer
75, 1761-8.

[68] Nishio, M., Koshikawa, T., Yatabe, Y., Kuroishi, T., Suyama, M., Nagatake, M., Sugiura,
T., Ariyoshi, Y., Mitsudomi, T. and Takahashi, T. (1997) Prognostic significance of cyclin D1
and retinoblastoma expression in combination with p53 abnormalities in primary, resected
non-small cell lung cancers. Clin Cancer Res 3, 1051-8.

[69] Caputi, M., De Luca, L., Papaccio, G., A, D.A., Cavallotti, I., Scala, P., Scarano, F.,
Manna, M., Gualdiero, L. and De Luca, B. (1997) Prognostic role of cyclin D1 in non small cell
lung cancer: an immunohistochemical analysis. Eur J Histochem 41, 133-8.

[70] Betticher, D.C., White, G.R., Vonlanthen, S., Liu, X., Kappeler, A., Altermatt, H.J.,
Thatcher, N. and Heighway, J. (1997) G1 control gene status is frequently altered in resectable
non-small cell lung cancer. Int J Cancer 74, 556-62.

[71] Volm, M., Koomagi, R. and Rittgen, W. (1998) Clinical implications of cyclins,
cyclin-dependent kinases, RB and E2F1 in squamous-cell lung carcinoma. Int J Cancer 79, 294-9.

[72] Kurasono, Y., Ito, T., Kameda, Y., Nakamura, N. and Kitamura, H. (1998) Expression of
cyclin D1, retinoblastoma gene protein, and p16 MTS1 protein in atypical adenomatous
hyperplasia and adenocarcinoma of the lung. An immunohistochemical analysis. Virchows Arch
432, 207-15.

[73] Tanaka, H., Fujii, Y., Hirabayashi, H., Miyoshi, S., Sakaguchi, M., Yoon, H.E. and
Matsuda, H. (1998) Disruption of the RB pathway and cell-proliferative activity in
non-small-cell lung cancers. Int J Cancer 79, 111-5.

[74] Olivero, M., Rizzo, M., Madeddu, R., Casadio, C., Pennacchietti, S., Nicotra, M.R., Prat,
M., Maggi, G., Arena, N., Natali, P.G., Comoglio, P.M. and Di Renzo, M.F. (1996)
Overexpression and activation of hepatocyte growth factor/scatter factor in human non-small-cell
lung carcinomas. Br J Cancer 74, 1862-8.

[75] Harvey, P., Warn, A., Newman, P., Perry, L.J., Ball, R.Y. and Warn, R.M. (1996)
Immunoreactivity for hepatocyte growth factor/scatter factor and its receptor, met, in human lung
carcinomas and malignant mesotheliomas. J Pathol 180, 389-94.

[76] Takanami, I., Tanana, F., Hashizume, T., Kikuchi, K., Yamamoto, Y., Yamamoto, T. and
Kodaira, S. (1996) Hepatocyte growth factor and c-Met/hepatocyte growth factor receptor in
pulmonary adenocarcinomas: an evaluation of their expression as prognostic markers. Oncology
53, 392-7.



[77] Siegfried, J.M., Weissfeld, L.A., Luketich, J.D., Weyant, R.J., Gubish, C.T. and
Landreneau, R.J. (1998) The clinical significance of hepatocyte growth factor for non-small cell
lung cancer. Ann Thorac Surg 66, 1915-8.

[78] Nguyen, P.L., Niehans, G.A., Cherwitz, D.L., Kim, Y.S. and Ho, S.B. (1996)
Membrane-bound (MUC1) and secretory (MUC2, MUC3, and MUC4) mucin gene expression in human
lung cancer. Tumour Biol 17, 176-92.

[79] Yu, C.J., Yang, P.C., Shun, C.T., Lee, Y.C., Kuo, S.H. and Luh, K.T. (1996)
Overexpression of MUC5 genes is associated with early post-operative metastasis in
non-small-cell lung cancer. Int J Cancer 69, 457-65.

[80] Yu, C.J., Shun, C.T., Yang, P.C., Lee, Y.C., Shew, J.Y., Kuo, S.H. and Luh, K.T. (1997)
Sialomucin expression is associated with erbB-2 oncoprotein overexpression, early recurrence,
and cancer death in non-small-cell lung cancer [published erratum appears in Am J Respir Crit
Care Med 1997 Aug;156(2 Pt 1):677-8]. Am J Respir Crit Care Med 155, 1419-27.

[81] Jarrard, J.A., Linnoila, R.I., Lee, H., Steinberg, S.M., Witschi, H. and Szabo, E. (1998)
MUC1 is a novel marker for the type II pneumocyte lineage during lung carcinogenesis. Cancer
Res 58, 5582-9.

[82] Ohgami, A., Tsuda, T., Osaki, T., Mitsudomi, T., Morimoto, Y., Higashi, T. and
Yasumoto, K. (1999) MUC1 mucin mRNA expression in stage I lung adenocarcinoma and its
association with early recurrence. Ann Thorac Surg 67, 810-4.

[83] Bejarano, P.A., Baughman, R.P., Biddinger, P.W., Miller, M.A., Fenoglio-Preiser, C.,
al-Kafaji, B., Di Lauro, R. and Whitsett, J.A. (1996) Surfactant proteins and thyroid transcription
factor-1 in pulmonary and breast carcinomas. Mod Pathol 9, 445-52.

[84] Harlamert, H.A., Mira, J., Bejarano, P.A., Baughman, R.P., Miller, M.A., Whitsett, J.A.
and Yassin, R. (1998) Thyroid transcription factor-1 and cytokeratins 7 and 20 in pulmonary and
breast carcinoma. Acta Cytol 42, 1382-8.

[85] Fontanini, G., Vignati, S., Lucchi, M., Mussi, A., Calcinai, A., Boldrini, L., Chine, S.,
Silvestri, V., Angeletti, C.A., Basolo, F. and Bevilacqua, G. (1997) Neoangiogenesis and p53
protein in lung cancer: their prognostic role and their relation with vascular endothelial growth
factor (VEGF) expression [see comments]. Br J Cancer 75, 1295-301.

[86] Shibusa, T., Shijubo, N. and Abe, (1998) Tumor angiogenesis and vascular endothelial
growth factor expression in stage I lung adenocarcinoma. Clin Cancer Res 4, 1483-7.

[87] Giatromanolaki, A., Koukourakis, M.I., Kakolyris, S., Turley, H., O'Byrne, K., Scott,
P.A., Pezzella, F., Georgoulias, V., Harris, A.L. and Gatter, K.C. (1998) Vascular endothelial
growth factor, wild-type p53, and angiogenesis in early operable non-small cell lung cancer. Clin
Cancer Res 4, 3017-24.

[88] Fontanini, G., Boldrini, L., Vignati, S., Chine, S., Basolo, F., Silvestri, V., Lucchi, M.,
Mussi, A., Angeletti, C.A. and Bevilacqua, G. (1998) Bcl2 and p53 regulate vascular endothelial
growth factor (VEGF)-mediated angiogenesis in non-small cell lung carcinoma. Eur J Cancer
34, 718-23.

[89] Takahama, M., Tsutsumi, M., Tsujiuchi, T., Kido, A., Okajima, E., Nezu, K., Tojo, T.,
Kushibe, K., Kitamura, S. and Konishi, Y. (1998) Frequent expression of the vascular endothelial
growth factor in human non-small-cell lung cancers. Jpn J Clin Oncol 28, 176-81.

[90] Sozzi, G., Miozzo, M., Tagliabue, E., Calderone, C., Lombardi, L., Pilotti, S., Pastorino,
U., Pierotti, M.A. and Della Porta, G. (1991) Cytogenetic abnormalities and overexpression of
receptors for growth factors in normal bronchial epithelium and tumor samples of lung cancer
patients. Cancer Res 51, 400-4.

[91] Volm, M., Efferth, T., Mattern, J. and Wodrich, W. (1992) Overexpression of c-fos and
c-erbB1 encoded proteins in squamous cell carcinomas of the lung of smokers. Int J Oncol 1,
69-71.

[92] Wodrich, W. and Volm, M. (1993) Overexpression of oncoproteins in non-small cell lung
carcinomas of smokers. Carcinogenesis 14, 1121-4.

[93] Pastorino, U., Sozzi, G., Miozzo, M., Tagliabue, E., Pilotti, S. and Pierotti, M.A. (1993)
Genetic changes in lung cancer. J Cell Biochem Suppl 17F, 237-48.

[94] Gorgoulis, V., Sfikakis, P.P., Karameris, A., Papastamatiou, H., Trigidou, R., Veslemes,
M., Spandidos, D.A., Sfikakis, P. and Jordanoglou, J. (1995) Molecular and
immunohistochemical study of class I growth factor receptors in squamous cell lung carcinomas.
Pathol Res Pract 191, 973-81.

[95] Rusch, V., Klimstra, D., Linkov, I. and Dmitrovsky, E. (1995) Aberrant expression of p53
or the epidermal growth factor receptor is frequent in early bronchial neoplasia and coexpression
precedes squamous cell carcinoma development. Cancer Res 55, 1365-72.

[96] Rusch, V.W. and Dmitrovsky, E. (1995) Molecular biologic features of non-small cell
lung cancer. Clinical implications. Chest Surg Clin N Am 5, 39-55.

[97] Fontanini, G., Vignati, S., Bigini, D., Mussi, A., Lucchi, H., Angeletti, C.A., Pingitore,
R., Pepe, S., Basolo, F. and Bevilacqua, G. (1995) Epidermal growth factor receptor (EGFr)
expression in non-small cell lung carcinomas correlates with metastatic involvement of hilar and
mediastinal lymph nodes in the squamous subtype. Eur J Cancer 31A, 178-83.

[98] Pflug, B. and Djakiew, D. (1996) Expression of the low affinity nerve growth factor
receptor in prostate epithelial cells negatively regulates nerve growth factor-mediated growth via
induction of apoptosis (Meeting abstract). Proc Annu Meet Am Assoc Cancer Res 37, A262.

[99] Rusch, V., Klimstra, D., Venkatraman, E., Langenfeld, J., Pisters, P. and Dmitrovsky, E.
(1996) Overexpression of EGFR and TGF-alpha is frequent in early stage non-small cell lung
cancer, but does not predict tumor progression (Meeting abstract). Proc Annu Meet Am Assoc
Cancer Res 37, A1314.

[100] Fujino, S., Enokibori, T., Tezuka, N., Asada, Y., Inoue, S., Kato, H. and Mori, A. (1996)
A comparison of epidermal growth factor receptor levels and other prognostic parameters in
non-small cell lung cancer. Eur J Cancer 32A, 2070-4.

[101] Pastorino, U., Andreola, S., Tagliabue, E., Pezzella, F., Incarbone, M., Sozzi, G., Buyse,
M., Menard, S., Pierotti, M. and Rilke, F. (1997) Immunocytochemical markers in stage I lung
cancer: relevance to prognosis. J Clin Oncol 15, 2858-65.

[102] Sekine, I., Takami, S., Guang, S.G., Yokose, T., Kodama, T., Nishiwaki, Y., Kinoshita,
M., Matsumoto, H., Ogura, T. and Nagai, K. (1998) Role of epidermal growth factor receptor
overexpression, K-ras point mutation and c-myc amplification in the carcinogenesis of non-small
cell lung cancer. Oncol Rep 5, 351-4.

[103] Pfeiffer, P., Nexo, E., Bentzen, S.M., Clausen, P.P., Andersen, K., Rose, C. and Nex, E.
(1998) Enzyme-linked immunosorbent assay of epidermal growth factor receptor in lung cancer:
comparisons with immunohistochemistry, clinicopathological features and prognosis. Br J
Cancer 78, 96-9.

[104] D'Amico, T.A., Massey, M., Herndon, J.E., 2nd, Moore, M.B. and Harpole, D.H., Jr.
(1999) A biologic risk model for stage I lung cancer: immunohistochemical analysis of 408
patients with the use of ten molecular markers. J Thorac Cardiovasc Surg 117, 736-43.

[105] Engel, M., Theisinger, B., Seib, T., Seitz, G., Huwer, H., Zang, K.D., Welter, C. and
Dooley, S. (1993) High levels of nm23-H1 and nm23-H2 messenger RNA in human
squamous-cell lung carcinoma are associated with poor differentiation and advanced tumor stages. Int J
Cancer 55, 375-9.

[106] Ozeki, Y., Takishima, K. and Mamiya, G. (1994) Immunohistochemical analysis of
nm23/NDP kinase expression in human lung adenocarcinoma: association with tumor
progression in Clara cell type. Jpn J Cancer Res 85, 840-6.

[107] Lai, W.W., Wu, M.H., Yan, J.J. and Chen, F.F. (1996) Immunohistochemical analysis of
nm23-H1 in stage I non-small cell lung cancer: a useful marker in prediction of metastases. Ann
Thorac Surg 62, 1500-4.

[108] Gazzeri, S., Brambilla, E., Negoescu, A., Thoraval, D., Veron, M., Moro, D. and
Brambilla, C. (1996) Overexpression of nucleoside diphosphate/kinase A/nm23-H1 protein in
human lung tumors: association with tumor progression in squamous carcinoma. Lab Invest 74,
158-67.

[109] MacKinnon, M., Kerr, K.M., King, G., Kennedy, M.M., Cockburn, J.S. and Jeffrey, R.R.
(1997) p53, c-erbB-2 and nm23 expression have no prognostic significance in primary
pulmonary adenocarcinoma. Eur J Cardiothorac Surg 11, 838-42.

[110] Bosnar, M.H., Pavelic, K., Krizanac, S., Slobodnjak, Z. and Pavelic, J. (1997) Squamous
cell lung carcinomas: the role of nm23-H1 gene. J Mol Med 75, 609-13.

[111] Kawakubo, Y., Sato, Y., Koh, T., Kono, H. and Kameya, T. (1997) Expression of nm23
protein in pulmonary adenocarcinomas: inverse correlation to tumor progression. Lung Cancer
17, 103-13.

[112] Ritter, J.H., Dresler, C.M. and Wick, M.R. (1995) Expression of bcl-2 protein in stage
T1N0M0 non-small cell lung carcinoma. Hum Pathol 26, 1227-32.

[113] Kitagawa, Y., Wong, F., Lo, P., Elliott, M., Verburgt, L.M., Hogg, J.C. and Daya, M.
(1996) Overexpression of Bcl-2 and mutations in p53 and K-ras in resected human non-small cell
lung cancers. Am J Respir Cell Mol Biol 15, 45-54.

[114] Rao, S.K., Krishna, M., Woda, B.A., Savas, L. and Fraire, A.E. (1996)
Immunohistochemical detection of bcl-2 protein in adenocarcinoma and non-neoplastic cellular
compartments of the lung. Mod Pathol 9, 555-9.

[115] Boers, J.E., ten Velde, G.P. and Thunnissen, F.B. (1996) P53 in squamous metaplasia: a
marker for risk of respiratory tract carcinoma. Am J Respir Crit Care Med 153, 411-6.

[116] Coppola, D., Clarke, M., Landreneau, R., Weyant, R.J., Cooper, D. and Yousem, S.A.
(1996) Bcl-2, p53, CD44, and CD44v6 isoform expression in neuroendocrine tumors of the lung.
Mod Pathol 9, 484-90.

[117] Higashiyama, M., Doi, O., Kodama, K., Yokouchi, H. and Tateishi, R. (1996) Bcl-2
oncoprotein expression is increased especially in the portion of small cell carcinoma within the
combined type of small cell lung cancer. Tumour Biol 17, 341-4.

[118] Strauss, G.M. (1997) Prognostic markers in resectable non-small cell lung cancer.
Hematol Oncol Clin North Am 11, 409-34.

[119] Anton, R.C., Brown, R.W., Younes, M., Gondo, M.M., Stephenson, M.A. and Cagle, P.T.
(1997) Absence of prognostic significance of bcl-2 immunopositivity in non-small cell lung
cancer: analysis of 427 cases. Hum Pathol 28, 1079-82.

[120] Ishida, H., Irie, K., Itoh, T., Furukawa, T. and Tokunaga, O. (1997) The prognostic
significance of p53 and bcl-2 expression in lung adenocarcinoma and its correlation with Ki-67
growth fraction. Cancer 80, 1634-45.

[121] Stefanaki, K., Rontogiannis, D., Vamvouka, C., Bolioti, S., Chaniotis, V., Sotsiou, F.,
Vlychou, M., Delidis, G., Kakolyris, S., Georgoulias, V. and Kanavaros, P. (1998)
Immunohistochemical detection of bcl2, p53, mdm2 and p21/waf1 proteins in small-cell lung
carcinomas. Anticancer Res 18, 1167-73.

[122] Brambilla, E., Gazzeri, S., Lantuejoul, S., Coll, J.L., Moro, D., Negoescu, A. and
Brambilla, C. (1998) p53 mutant immunophenotype and deregulation of p53 transcription
pathway (Bcl2, Bax, and Waf1) in precursor bronchial lesions of lung cancer. Clin Cancer Res 4,
1609-18.



[123] Salgia, R. aiid Skarin, A.T. (1998) Molecular abnormalities in lung cancer. J Clin Oncol 
16, 1207-17. 

[124] Kim, Y.C., Park, K.O., Kern, J.A., Park, C.S., Lim, S.C., Jang, A.S. and Yang, J.B. 
(1998) The interactive effect of Ras, HER2, P53 and Bci-2 expression in predicting the survival 
of non-small cell lung cancer patients. Lung Cancer 22, 181-90. 

[125] Groeger, A.M., Caputi, M., Esposito, V., De Luca, A., Salat, A., Murabito, M., Giordano, 
G.G., Baldi, F., Giordano, A. and Wolner, E. (1999) Bcl-2 protein expression correlates with 
nodal staitus in non small cell lung cancer. Anticancer Res 19, 821-4. 

[126] Vargas, S.O., Leslie, K.O., Vacek, P.M., Socinski, M.A. and Weaver, D.L. (1998) 
Estrogen:receptor-related.protein p29 in primary nohsmall cell lung carcinoma: pathologic and 
prognostic correlations. Cancer 82, 1495-500. 

[127] Higashiyama, M., Doi, 0., Kodama, K., Yokouchi, H. and Tateishi, R. (1994) 
Retinoblastoma protein expression in lung cancer: an immunohistochemical analysis. Oncology 
51,544-51. 

[128] Xu, H.J., Quinlan, p.C., Davidson, A.G. and et.al. (.1994) Altered retinoblastoma protein 
expression and prognosis in early stage non-small cell lung barcinoma. J. Natl. Cancer Inst. 86, 
695-699. ■ 

[129] Lee, J.S., Kalapurakal, S., Ro, J.Y. and Hong, W.K. (1995) Prognostic significance of 
retinoblastoma protein expression in non- small cell lung cancer (Meeting abstract). Proc Annu 
Meet Am Assoc Cancer Res 36, A3787 1995. 

[130] Dixon, G., Salisbury, J, and Walker, C. (1995) Expression of the retinoblastoma protein 
in normal and dysplastic bronchial epithelium and lirag cancer (Meeting abstract). J Pathol 176, 
32A1995. 

[131] Shapiro, G.I., Edwards, C.D., Kobzik, L., Godleski, J., Richards, W., Sugarbaker, D.J. 
and Rollins, B.J. (1995) Reciprocal Rb inactivation and p16 expression in lung cancer (Meeting 
abstract). Proc Annu Meet Am Assoc Cancer Res 36, A164 1995. 

[132] Volm, M. and Stammler, G. (1996) Retinoblastoma (Rb) protein expression and 
resistance in squamous cell lung carcinomas. Anticancer Res 16, 891-4. 

[133] Dosaka-Akita, H., Hu, S.X., Kinoshita, I., Fujino, M., Harada, M., Kawakami, Y. and 
Benedict, W.F. (1996) Prognostic significance of Rb protein expression in non-small cell lung 
cancer (NSCLC) (Meeting abstract). Proc Annu Meet Am Assoc Cancer Res 37, A1401 1996. 

[134] Kinoshita, I., Dosaka-Akita, H., Akie, K., Mishina, T., Hiroumi, H. and Kawakami, Y. 
(1996) Significance of abnormal p16INK4 and RB protein expression in non-small cell lung 
cancer (NSCLC) (Meeting abstract). Proc Annu Meet Am Assoc Cancer Res 37, A3979 1996. 

[135] Kratzke, R.A., Greatens, T.M., Rubins, J.B., Maddaus, M.A., Niewoehner, D.E., Niehans, 
G.A. and Geradts, J. (1996) Rb and p16INK4a expression in resected non-small cell lung tumors. 
Cancer Res 56, 3415-20. 

[136] Sakaguchi, M., Fujii, Y., Hirabayashi, H., Yoon, H.E., Komoto, Y., Oue, T., Kusafuka, 
T., Okada, A. and Matsuda, H. (1996) Inversely correlated expression of p16 and Rb protein in 
non-small cell lung cancers: an immunohistochemical study. Int J Cancer 65, 442-5. 

[137] Xu, H.J., Cagle, P.T., Hu, S.X., Li, J. and Benedict, W.F. (1996) Altered retinoblastoma 
and p53 protein status in non-small cell carcinoma of the lung: potential synergistic effects on 
prognosis. Clin Cancer Res 2, 1169-76 1996. 

[138] Dosaka-Akita, H., Hu, S.X., Fujino, M., Harada, M., Kinoshita, I., Xu, H.J., Kuzumaki, 
N., Kawakami, Y. and Benedict, W.F. (1997) Altered retinoblastoma protein expression in 
nonsmall cell lung cancer: its synergistic effects with altered ras and p53 protein status on 
prognosis. Cancer 79, 1329-37. 

[139] Cagle, P.T., el-Naggar, A.R., Xu, H.J., Hu, S.X. and Benedict, W.F. (1997) Differential 
retinoblastoma protein expression in neuroendocrine tumors of the lung. Potential diagnostic 
implications. Am J Pathol 150, 393-400. 

[140] Kashiwabara, K., Oyama, T., Sano, T., Fukuda, T. and Nakajima, T. (1998) Correlation 
between methylation status of the p16/CDKN2 gene and the expression of p16 and Rb proteins in 
primary non-small cell lung cancers. Int J Cancer 79, 215-20. 

[141] Caputi, M., Esposito, V., Groger, A.M., De Luca, A., Pacilio, C., Dekan, G., Giordano, 
G.G., Baldi, F., Wolner, E. and Giordano, A. (1998) RB growth control evasion in lung cancer. 
Anticancer Res 18, 2371-4. 

[142] Tamura, A., Matsubara, O., Hirokawa, K. and Aoki, N. (1993) Detection of 
thrombomodulin in human lung cancer cells. Am J Pathol 142, 79-85. 

[143] Tamura, A., Komatsu, H., Hebisawa, A., Kurashima, A., Mori, M. and Katayama, T. 
(1996) Is thrombomodulin useful as a tumor marker of a lung cancer? Lung Cancer 15, 189-95. 

[144] Collins, C.L., Ordonez, N.G., Schaefer, R., Cook, C.D., Xie, S.S., Granger, J., Hsu, P.L., 
Fink, L. and Hsu, S.M. (1992) Thrombomodulin expression in malignant pleural mesothelioma 
and pulmonary adenocarcinoma. Am J Pathol 141, 827-33. 

[145] Hamatake, M., Ishida, T., Mitsudomi, T., Akazawa, K. and Sugimachi, K. (1996) 
Prognostic value and clinicopathological correlation of thrombomodulin in squamous cell 
carcinoma of the human lung. Clin Cancer Res 2, 763-6 1996. 

[146] Ordonez, N.G. (1997) Value of thrombomodulin immunostaining in the diagnosis of 
mesothelioma. Histopathology 31, 25-30. 

[147] Tolnay, E., Wiethege, T. and Muller, K.M. (1997) Expression and localization of 
thrombomodulin in preneoplastic bronchial lesions and in lung cancer. Virchows Arch 430, 
209-12. 






[148] Bohm, M., Totzeck, B. and Wieland, I. (1994) Differences of E-cadherin expression 
levels and patterns in human lung cancer. Ann Hematol 68, 81-3. 

[149] Bohm, M., Totzeck, B., Birchmeier, W. and Wieland, I. (1994) Differences of E-cadherin 
expression levels and patterns in primary and metastatic human lung cancer. Clin Exp Metastasis 
12, 55-62. 

[150] Peralta Soler, A., Knudsen, K.A., Jaurand, M.C., Johnson, K.R., Wheelock, M.J., 
Klein-Szanto, A.J. and Salazar, H. (1995) The differential expression of N-cadherin and E-cadherin 
distinguishes pleural mesotheliomas from lung adenocarcinomas [see comments]. Hum Pathol 
26, 1363-9. 

[151] Han, A.C., Peralta-Soler, A., Knudsen, K.A., Wheelock, M.J., Johnson, K.R. and Salazar, 
H. (1997) Differential expression of N-cadherin in pleural mesotheliomas and E-cadherin in lung 
adenocarcinomas in formalin-fixed, paraffin-embedded tissues [see comments]. Hum Pathol 28, 
641-5. 

[152] Weynants, P., Lethe, B., Brasseur, F., Marchand, M. and Boon, T. (1994) Expression of 
mage genes by non-small-cell lung carcinomas. Int J Cancer 56, 826-9. 

[153] Shichijo, S., Hayashi, A., Takamori, S., Tsunosue, R., Hoshino, T., Sakata, M., 
Kuramoto, T., Oizumi, K. and Itoh, K. (1995) Detection of MAGE-4 protein in lung cancers. Int 
J Cancer 64, 158-65. 

[154] Sakata, M. (1996) Expression of MAGE gene family in lung cancers. Kurume Med J 43, 
55-61. 

[155] Fischer, C., Gudat, F., Stulz, P., Noppen, C., Schaefer, C., Zajac, P., Trutmann, M., 
Kocher, T., Zuber, M., Harder, F., Heberer, M. and Spagnoli, G.C. (1997) High expression of 
MAGE-3 protein in squamous-cell lung carcinoma [letter]. Int J Cancer 71, 1119-21. 

[156] Gotoh, K., Yatabe, Y., Sugiura, T., Takagi, K., Ogawa, M., Takahashi, T. and Mitsudomi, 
T. (1998) Frequency of MAGE-3 gene expression in HLA-A2 positive patients with non-small 
cell lung cancer. Lung Cancer 20, 117-25. 

[157] Uchiyama, B., Saijo, Y., Kumano, N., Abe, T., Fujimura, S., Ohkuda, K., Handa, M., 
Satoh, K. and Nukiwa, T. (1997) Expression of nucleolar protein p120 in human lung cancer: 
difference in histological types as a marker for proliferation. Clin Cancer Res 3, 1873-7. 

[158] Singh, G., Scheithauer, B.W. and Katyal, S.L. (1986) The pathobiologic features of 
carcinomas of type II pneumocytes. An immunocytologic study. Cancer 57, 994-9. 

[159] Mizutani, Y., Nakajima, T., Morinaga, S., Gotoh, M., Shimosato, Y., Akino, T. and 
Suzuki, A. (1988) Immunohistochemical localization of pulmonary surfactant apoproteins in 
various lung tumors. Special reference to nonmucus producing lung adenocarcinomas. Cancer 
61, 532-7. 

[160] Noguchi, M., Nakajima, T., Hirohashi, S., Akiba, T. and Shimosato, Y. (1989) 
Immunohistochemical distinction of malignant mesothelioma from pulmonary adenocarcinoma 
with anti-surfactant apoprotein, anti-Lewisa, and anti-Tn antibodies. Hum Pathol 20, 53-7. 

[161] Linnoila, R.I., Jensen, S.M., Steinberg, S.M., Mulshine, J.L., Eggleston, J.C. and Gazdar, 
A.F. (1992) Peripheral airway cell marker expression in non-small cell lung carcinoma. 
Association with distinct clinicopathologic features [see comments]. Am J Clin Pathol 97, 
233-43. 

[162] Shijubo, N., Tsutahara, S., Hirasawa, M., Takahashi, H., Honda, Y., Suzuki, A., Kuroki, 
Y. and Akino, T. (1992) Pulmonary surfactant protein A in pleural effusions. Cancer 69, 2905-9. 

[163] Shijubo, N., Honda, Y., Fujishima, T., Takahashi, H., Kodama, T., Kuroki, Y., Akino, T. 
and Abe, S. (1995) Lung surfactant protein-A and carcinoembryonic antigen in pleural effusions 
due to lung adenocarcinoma and malignant mesothelioma. Eur Respir J 8, 403-6. 

[164] Nicholson, A.G., McCormick, C.J., Shimosato, Y., Butcher, D.N. and Sheppard, M.N. 
(1995) The value of PE-10, a monoclonal antibody against pulmonary surfactant, in 
distinguishing primary and metastatic lung tumours. Histopathology 27, 57-60. 

[165] Khoor, A., Whitsett, J.A., Stahlman, M.T. and Halter, S.A. (1997) Expression of 
surfactant protein B precursor and surfactant protein B mRNA in adenocarcinoma of the lung. 
Mod Pathol 10, 62-7. 

[166] Saitoh, H., Shimura, S., Fushimi, T., Okayama, H. and Shirato, K. (1997) Detection of 
surfactant protein-A gene transcript in the cells from pleural effusion for the diagnosis of lung 
adenocarcinoma. Am J Med 103, 400-4. 

[167] Grohs et al., Acta Cytologica, 1996, 40(1):26-30. 

[168] Grohs et al., Acta Cytologica, 1997, 41(1):144-152. 






